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Abstract 

Random matrices tend to be well conditioned, and we employ this well known property 
to advance matrix computations. We prove that our algorithms employing Gaussian random 
matrices are efficient, but in our tests the algorithms have consistently remained as powerful 
where we used sparse and structured random matrices, defined by much fewer random parame- 
ters. We numerically stabilize Gaussian elimination with no pivoting as well as block Gaussian 
elimination, precondition an ill conditioned linear system of equations, compute numerical rank 
of a matrix without pivoting and orthogonalization, approximate the singular spaces of an ill 
conditioned matrix associated with its largest and smallest singular values, and approximate 
this matrix with low-rank matrices, with applications to its 2 x 2 block triangulation and to 
tensor decomposition. Some of our results and techniques can be of independent interest, e.g., 
our estimates for the condition numbers of random Toeplitz and circulant matrices and our 
variations of the Sherman-Morrison- Woodbury formula. 

2000 Math. Subject Classification: 15A52, 15A12, 15A06, 65F22, 65F05, 65F10 
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1 Introduction 

It is well known that random matrices tend to be well conditioned [D88| . |E88j . |ES05j . |CD05| , 
[SST06j . |B11| . and we employ this property to advance matrix computations. We prove that 
with probability 1 or near 1 our techniques of randomized preprocessing precondition a large and 
important class of ill conditioned matrices. By employing randomization we stabilize numerically 
Gaussian elimination with no pivoting as well as block Gaussian elimination, precondition an ill 
conditioned linear system of equations, compute numerical rank of a matrix using no pivoting and 

*Some results of this paper have been presented at the ACM-SIGSAM International Symposium on Symbolic and 
Algebraic Computation (ISSAC '2011), San Jose, CA, 2011, the 3nd International Conference on Matrix Methods in 
Mathematics and Applications (MMMA 2011) in Moscow, Russia, June 22-25, 2011, the 7th International Congress 
on Industrial and Applied Mathematics (ICIAM 2011), in Vancouver, British Columbia, Canada, July 18-22, 2011, 
the SIAM International Conference on Linear Algebra, in Valencia, Spain, June 18-22, 2012, and the Conference on 
Structured Linear and Multilinear Algebra Problems (SLA2012), in Leuven, Belgium, September 10-14, 2012 
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orthogonalization, approximate the singular spaces of an ill conditioned matrix A associated with 
its largest and smallest singular values, approximate this matrix by low-rank matrices, compute its 
2x2 block triangulation, and compute a Tensor Train approximation of a tensor. Our analysis 
and experiments show substantial progress versus the known algorithms. In our tests most of our 
techniques have fully preserved their power when we dramatically decreased the number of random 
parameters involved, which should motivate further research. Some of our results and techniques 
can be of independent interest, e.g., our estimates for the condition numbers of random Toeplitz and 
circulant matrices and our extensions of the Sherman-Morrison- Woodbury formula. 

1.1 Numerically safe Gaussian elimination with no pivoting 

Hereafter "flop" stands for "arithmetic operation" , "expected" and "likely" mean "with probability 
1 or close to 1", crj(A) denotes the jth largest singular value of an n x n matrix A, and the ratio 
k{A) = ai{A)/ap{A) for p = rank(A) denotes its condition number. k{A) = \\A\\ \\A~^\\ if p — n, 
that is if A is a nonsingular matrix. If this number is large in context, then the matrix A is ill 
conditioned, otherwise well conditioned. For matrix inversion and solving linear systems of equations 
the condition number represents the output magnification of input errors. 



and backward error analysis implies similar magnification of rounding errors |GL96] . |H02] . |S98] . 

To avoid dealing with singular or ill conditioned matrices in Gaussian elimination, one incorpo- 
rates pivoting, that is row or column interchange. Gaussian elimination with no pivoting (hereafter 
we refer to it as GENP) can easily fail in numerical computations with rounding errors, except for 
some special input classes such as the classes of diagonally dominant and positive definite matrices. 
For such matrices, GENP outperforms Gaussian elimination with pivoting |GL96[ page 119]. By 
extending our previous study in [PGMQ] Section 12.2] and |PQZa| , we expand these classes dra- 
matically by proving that pre- and post-multiplication of a well conditioned coefficient matrix of 
full rank by a square Gaussian random matrix is expected to yield safe numerical performance of 
GENP as well as block Gaussian elimination (see Remark l4.ip . The results of our tests support this 
theoretical finding consistently. Furthermore the tests show that the power of our preprocessing is 
fully preserved where we use just circulant or Householder multipliers generated by a vector or a pair 
of vectors, respectively, and where we fill these vectors with integers ±1 and limit randomization to 
the choice of the signs ± (see our Table [TO^ and |PQZa[ Table 2]). 

1.2 Randomized preconditioning: the basic theorem 

Given an ill conditioned matrix A, can we extend our progress by applying randomized multipliers 
X and Y to yield a much better conditioned matrix product XAYl No, because random square 
matrices X and Y are expected to be nonsingular and well conditioned |D88] , |E88) , |ES05| , [CD05| , 
[SST06| . [Bllj and because k{XAY) > -^y^^y)- Approximate inverses are popular multipliers but 
only where it is not costly to compute them, that is only for some special, although important classes 
of matrices A, except for the surprising empirical technique in (R90] (see our Rematk l9^ . 

We can readily produce a well conditioned matrix C by applying additive preprocessing A 
C = A + P, e.g., by choosing P = I — A, but it is not clear how this could help us to solve a 
linear system Ay = b. (Here and hereafter / denotes the identity matrix.) Assume, however, that 
we are given a nonsingular ill conditioned n x n matrix A together with a small upper bound r on 
its numerical nullity nnul(^), that is on the number of its singular values that are much smaller 
than the 2-norm ||A||2. Such matrices make up a large and important subclass of nonsingular ill 
conditioned matrices (cf. |CDG03j and Remark [^?T|) . for which randomized additive preconditioning 
is supported by the following theorem. In Section [5] we prove it, extend it to rectangular matrices 
A, and further detail its estimates. 
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Theorem 1.1. Suppose A is a real n x n matrix having a numerical rank p (that is the ratio 
crp_|_i(A)/||A|| is small, hut the ratio (Tp(y4.)/||A|| is not small), all and aV are standard Gaussian 
random n x r matrices, whose all 2nr entries are independent of each other, 0<p<n, 0<r<n, 
and C = A + UV^ . Then (i) we can expect that 

0.5||A||2 < IICII2 < l.5\\A\\2 if, say a/\\A\\2 < ^. (1.2) 

(ii) Furthermore the matrix C is singular or ill conditioned if r < n — p, (Hi) hut otherwise it 
is nonsingular with probability 1 and (iv) is expected to be well conditioned if the ratio a/\\A\\2 is 
neither large nor small, e.g., if < a/\\A\\2 < 100. 

1.3 Randomized algorithms 

We recall some known randomized matrix algorithms |D83) . [HMTll] and study some new ones. 
Suppose we are given a normalized nonsingular n x n matrix A, such that ||A|| = 1, and its small 
positive numerical nullity r = nnul(j4) (cf. Section 18.21 on computing numerical nullity). Suppose 
also that we have generated two standard Gaussian random nx r matrices U and V and applied ran- 
domized additive preconditioning A C — A + UV'^ producing nonsingular and well conditioned 
matrix C, as we can expect by virtue of Theorem 11.11 Then we can apply the Sherman-Morrison— 
Woodbury formula, hereafter referred to as the SMW formula, to reduce an ill conditioned linear 
system Ay = b of n equations to well conditioned linear systems Cx = f . 

By virtue of (|l.ip we must perform our computations with high accuracy to obtain meaningful 
output where the matrix A is ill conditioned, but we can limit highly accurate computations with 
multiple or infinite precision to O(ri^) flops and performs the order of remaining fiops with 
the standard double or single IEEE precision versus order of n'^ high precision fiops in Gaussian 
elimination. It may be even more promising to combine randomized additive preprocessing and the 
SMW formula to compute multiplicative preconditioners of the matrix A (see Remark 19. 6|) . 

We also apply our approach numerically, with double precision, to the approximation of the 
singular spaces associated with the p = n — r largest and the r smallest singular values of an ill 
conditioned matrix A that has a positive numerical nullity r = nnul(A). We rely on the following 
observations. Suppose the matrix C of Theorem ll.ll is nonsingular and well conditioned, as expected. 
Then (a) we can readily compute the nx r matrices C~^U and C~"^y by solving 2r linear systems 
of equations with the matrices C and C'^ and (b) the ranges of the matrices C~^U and C~'^V 
approximate closely the left and right trailing singular spaces, respectively, associated with the r 
smallest singular values of an ill conditioned matrix A. Furthermore we approximate the left and 
right leading singular spaces associated with the p largest singular values of the matrix A, both 
by extending these techniques and by means of random sampling A AH and A =^ A^G 
for Gaussian random n x matrices G and H and for small positive integers p+ — p |HMTll] . 
We extend these randomized algorithms to compute numerical rank of a matrix using no pivoting 
or orthogonalization, its 2 x 2 block triangulation and its low-rank approximation with a further 
extension to tensor decomposition. 

1.4 Sparse and structured matrix computations, randomized augmenta- 
tion, and numerical experiments 

We cannot extend our proof of Theorem 1 1.1 1 to the case of Gaussian random Toeplitz matrices U and 
V , but such extension has been supported consistently by the results of our numerical tests. In these 
tests our algorithms remained as efficient where we replaced Gaussian random matrices by matrices 
defined by much fewer random parameters such as the circulant and Householder multipliers of 
Section [Ol and the block vectors U with blocks ±/ and O in the additive preprocessing A =^ G = 
A + UU^ in Section [TO In these cases we limited randomization to the choice of the block sizes in 
the block vectors and of the signs ±. We have amended our algorithms to preserve matrix sparseness 
and structure, in particular for computing numerical rank and nullity. 
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Additive preprocessing A C = A + UV^ preserves matrix structure and sparseness quite well 
where the matrices U and V have consistent structure and sparseness, but both random sampling and 

randomized augmentation, such as the maps A A^G and A K — ^ ^ ^ , achieve this 

even more perfectly where we properly extend the patterns of sparseness and structure of the matrix 
A to the random matrices G, U, V, and W. For V = —U and symmetric positive definite matrices 
A and K the above augmentation increases the condition number k{A), in contrast to additive 
preprocessing A C = A + VV^ where A is a nonnegative definite matrix. Other than that, 
however, we observe close similarity between augmentation and additive preprocessing, we prove 
similar preconditioning properties of both maps under randomization and extend to augmentation 
the SMW and dual SMW formulae as well as Theorem 11.11 and our expressions for the bases of 
leading and traifing singular spaces (see Section [6l Corollaries 17.21 and 17.41 and equations (|6.2p and 
(|7.4p ). Then again our tests were in good accordance with all these extensions even in the case of 
sparse and structured matrices G, U , V, and VF, defined by a small number of random parameters, 
although our formal proofs only apply to the case of Gaussian random matrices G, U, V, and W. 

Other than that our tests for randomized multiplicative and additive preprocessing and augmen- 
tation were in good accordance with our theoretical estimates. By applying randomized additive 
preprocessing to ill conditioned matrices we consistently observed dramatic decrease of the condition 
numbers (see Tabic [T0?5|) . and this enabled accurate solution of the respective ill conditioned linear 
systems of equations (see Tables fTO . 7H 1 . 1 Ol and fTO . 1 5 p . whereas random circulant multipliers filled 
with ±1 have stabilized GENP numerically (see Table 110. 6p . Furthermore according to our test 
results, our algorithms approximated accurately the leading and trailing singular spaces of ill condi- 
tioned matrices and approximated a matrix by a low-rank matrix (see Tables [TO.llHl 0.14)) . We have 
also matched the output accuracy of the customary algorithms for solving ill conditioned Toeplitz 
linear systems of equations but outperformed them dramatically in terms of the CPU time (see 
Table 110. 16p . Finally our experimental data on the condition numbers of Gaussian random Toeplitz 
and circulant matrices are in good accordance with our formal estimates (see Sections 13.41 and 13.51 
and Tables I10.1H10.4| . These estimates respond to a challenge in |SST06| and can be surprising 
because the paper [BG05| has proved that the condition numbers of n x n Toeplitz matrices grow 
exponentially in n as n — >■ oo for some large and important classes of Toeplitz matrices, whereas we 
prove the opposit for Gaussian random Toeplitz matrices. 



1.5 Organization of the paper and selective reading 

We recall some definitions and basic results on matrix computations in the next section. We estimate 
the condition numbers of Gaussian random general, Toeplitz and circulant matrices in Section[3]and 
of randomized matrix products in Section |4l The latter estimate imply that randomized multipliers 
support GENP and block Gaussian elimination. In Section [5] we prove that our randomized addi- 
tive preprocessing of an ill conditioned matrix is expected to produce a well conditioned matrix. 
In Section [6] we prove a similar property of randomized augmentation, which we link to random- 
ized additive preprocessing and apply to the solution of ill conditioned Toeplitz linear systems of 
equations. 

In Sections [7]-[9] we apply randomization to computations with ill conditioned matrices having 
small numerical nullities or ranks. We compute numerical rank of such matrix using no pivoting or 
orthogonalization, approximate its trailing and leading singular spaces, approximate it by a low-rank 
matrix, and point out applications to tensor decomposition and to approximation by structured ma- 
trices. We also apply our randomized additive preconditioning to compute 2x2 block triangulation 
of an ill conditioned matrix A, to compute its inverse, and to precondition a linear system Ay = b. 
We comment on randomized computations with structured and sparse inputs in Section [8l 

In Section [10] we cover numerical tests, which constitute the contribution of the second and the 
third authors. In Section [Tl] we comment on the related works, our progress, and some directions 
for further study. In Appendix A we recall some results on structured matrices. In Appendix 
B we estimate the probability that a random matrix has full rank under the uniform probability 
distribution. In Appendix C we comment on the extension of our probabilistic estimates to the case 
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of complex matrices. 

The paper can be partitioned into two parts: the next five sections, Section [TT] and the Appendix 
cover basic theorems and make up Part 1, whereas Sections [T HlOl on the algorithms and tests, 
make up Part 2. The correctness proofs of the algorithms of Part 2 employ the results of Part 
1, but otherwise Part 2 can be read independently of Part 1. In selective reading one can skip 
the subjects of structured matrices and tensors (in particular Sections 12.71 12.81 13.41 13.51 16.51 El 
110.31 110.61 a-nd Appendix A) or augmentation (Section [6] and all related materials), or can read 
only selected algorithmic material, e.g., on low-rank approximation by means of random sampling 
(Proto- Algorithms 17.11 and 18.11 Section 110.41 and the supporting results) or on preprocessing that 
supports GENP and block Gaussian elimination fTheorems l2.5l[3T2] and l4?Tl Remark [4.1 1 and Section 
110. 3p . In the paper, unlike its introduction, we study the general case of rectangular input matrices 
A, but again one may restrict oneself to the simpler special case of square matrices. 

2 Some definitions and basic results 

We assume computations in the field M of real numbers, and comment on the extension to the field 
C of complex numbers in Appendix [Cj 

Hereafter "flop" stands for "arithmetic operation"; "expected" and "likely" mean "with prob- 
ability 1 or close to 1", and the concepts "large", "small", "near", "closely approximate", "ill 
conditioned" and "well conditioned" are quantified in the context. For two scalars a and b we write 
a ^ h and & ^ a if the ratio \h/a\ is large. We write a ~ 6 if |a — &| ^ |a| -I- |&|. Next we recall and 
extend some customary definitions of matrix computations |GL96| . |S98| . 

2.1 Some basic definitions on matrix computations 

gmxn j^YiQ class of real m x n matrices A = {ai,j)^j^. 

(Bi I ... I Bk) — (i?j)^=i is a 1 X fc block matrix with blocks Bi, . . . , Bk- diag(_Bi, . . . , Bk) = 
dia,g{Bj)'j^^ is a fc X fc block diagonal matrix with diagonal blocks Bi, . . . , Bk- 

Gi is the ith coordinate vector of dimension n for i = 1, . . . , n. These vectors define the identity 
matrix J„ = (ei | . . . | e„) and the reflection matrix J„ = (e„ | . . . | ei), both of size n x n. Ok,i 
is the fc X / matrix filled with zeros. 0^ is the vector Ok,i- We write /, J, O, and where the size 
of a matrix or a vector is not important or is defined by context. Furthermore we write 

Ig,h = Ig where g < h, whereas IgM = [h \ Oh^g-h) where g > h. (2.1) 

is the transpose of a matrix A. A^ is its Hermitian transpose. A matrix A is symmetric if 
A — and is symmetric positive definite \i A = B^B for a real nonsingular matrix B. 

A real matrix Q is called orthogonal if Q^Q = I or QQ^ = I. More generally, over the complex 
field C a matrix U is called unitary if U^U — I or UU^ — I. Hereafter Q{A) denote a unique 
orthogonal matrix specified by the following result. 

Fact 2.1. fGL96[ Theorem 5.2.2]. QR factorization A = QR of a matrix A having full column rank 
into the product of an orthogonal matrix Q = Q{A) and an upper triangular matrix R — R{A) is 
unique provided that the factor R is a square matrix with positive diagonal entries. 

2.2 Range, null space, rank, nullity, nmbs, and generic rank profile 

TZiA) denotes the range of an m x n matrix A, that is the linear space {z : z = Ax.} generated 
by its columns. A/'(A) denotes its null space {v : Av — 0}, rank(yl) = dim7?.(A) its rank, and 
nul(A) — dimN'{A) = n — rank(A) its right nullity or just nullity, whereas nul(A"^) — m — rank(A) 
is the left nullity of A, equal to nul(A) if and only if m = n. v is the null vector of A if Av = 0. 

Fact 2.2. The set M of mxn matrices of rank p is an algebraic variety of dimension (m + n — p)p. 
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Proof. Let M he an m x n matrix of a rank p with a nonsingular leading p y. p block A/qo £^nd 

write M — i^?'^ ^'^^1 • Then the (to — p) x in — p) Schur complement Mn — Mi^MZ} Mqi must 
\^iWio Mil J 

vanish, which imposes (m — p){n — p) algebraic equations on the entries of M. Similar argument 
can be applied where any p x p submatrix of the matrix M (among ( ^ j [pj ^^'^^ submatrices) is 
nonsingular. Therefore dimM — mn — (to — p){n — p) = (to + n — p)p. □ 

A matrix _B is a matrix cover for its range TZ{B). A matrix cover is a matrix basis (for its range) 
if it has full column rank. A matrix basis B for the null space J^{A) is a null matrix basis or a 
nmb for the matrix A, and we write B = nmb(^). J\f{A'^) is the left null space of a matrix A, and 
similarly the map A A'^ defines left null vectors, left nmbs, and the left nullity of a matrix A. 

(k) 

Aj, denotes the leading, that is northwestern k x k block submatrix of a matrix A. A matrix of a 
rank p has generic rank profile if all its leading i x i blocks are nonsingular for i — 1, . . . , p. If such 
matrix is nonsingular itself, then it is called strongly nonsingular. 

2.3 Norms, SVD, and singular spaces 



is the ft,-norm and ||^||f — \/'^Tj=i Frobenius norm of a matrix A = (ai.j)™'"i- 



We write ||A|| = ||A||2 and ||v|| = Vv'^v = ||v||2 and recall from jGL96| Section 2.3.2 and Corollary 
2.3.2] that 

ma.x^jl-^\aij\ < \\A\\ = ||A^|| < y/rnn max™;"Jaij|, 
^PI|i<||^||<V^PI|i, Plli^ll^^lU, \\Ar<\\A\\i\\A\U (2.2) 



TO 

\\A\\<\\A\\f<V^\\A\\, (2.3) 

< for = l,2,oo and any matrix product AB. (2.4) 

A matrix A is normalized if ||A|| = 1. A normalized vector is orthogonal (unitary), and we call it 
unit. We write A^ B if\\A- B\\ < \\A\\ + \\B\\. 

Define an SVD or full SVD of an to x n matrix A of a rank p as follows, 

A = Sa^aT^- (2.5) 

Here SaS'^ = S'^Sa = Im, TaTJ^ = T^Ta = /„, ^a = diag(SA, Om-p,«-p), = diag(crj(A))^^;^, 
aj = crj{A) = (Tj(A^) is the jth largest singular value of a matrix A for j — 1, . . . , p, and we write 
cTj = for j > p. These values have the minimax property 

c^j^^.max ^ min, , | |Ax| |, j = 1, . . . , p, (2.6) 

dim(s)— J xGs, ||x|| — 1 

where S denotes linear spaces |GL961 Theorem 8.6.1]. Consequently ap > 0, fii = max||x||^i ||^x|| — 
\\A\l and 

II diag(Mj)j|| — max ||Mj|| for any set of matrices Mj. (2-7) 

j 

Fact 2.3. If Aq is a submatrix of a matrix A, then crj{A) > (Tj(Ao) for all j. 

Proof. [GL96| Corollary 8.6.3] implies the claimed bound where Aq is any block of columns of the 
matrix A. Transposition of a matrix and permutations of its rows and columns do not change 
singular values, and thus we can extend the bounds to all submatrices Aq. □ 

Theorem 2.1. We have \(jj{C) — crj{C + E)\ < \\E\\ for all m x n matrices C and E and all j. 

Proof See [GL961 Corollary 8.6.2] or [SMl Corollary 4.3.2]. □ 
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In Sections[7Hn]we use the following definitions. For every integer k in the range 1 < k < rank(^) 
define the partition Sa = {Sk,A \ SA,m-k) and Ta = (Tk.A \ TA,n-k) where the submatrices Sk.A 
and Tk^A are formed by the first k columns of the matrices Sa and Ta, respectively. Write 'Sk.A = 
diag((Tj (A))^^;^, §k,A = TZ{Sk,A) and Tfe,^ = TZ{Tk^A)- If ffc > crfc+i, then Sk,A and Tk^A are the left 
and right leading singular spaces, respectively, associated with the k largest singular values of the 
matrix A, whereas their orthogonal complements SA,m-k = Ti{S A.m-k) and TA,n-k — 'Ti{TA,n-k) 
are the left and right trailing singular spaces, respectively, associated with the other singular values 
of A. The pairs of subscripts {fc, A} versus {A,m — k} and {A,n — fc} mark the leading versus 
trailing singular spaces. The left singular spaces of A are the right singular spaces of A^ and vice 
versa. All matrix bases for the singular spaces 'Bk.A and Tk,A are given by matrices Sk.A^ and 
Tk,A^ , respectively, for nonsingular k x k matrices X and Y. Orthogonal matrices X and Y define 
orthogonal matrix bases for these spaces. B is an approximate matrix basis for a space § within a 
relative error norm bound r if there exists a matrix E such that _B + _E is a matrix basis for this 
space S and if ||£'|| < T||i?||. 



2.4 Inverses, generalized inverses, and perturbation bounds 

= Ta diag(S^^, 0„_p_m_p)S'J is the Moore-Penrose pseudo-inverse of the matrix A of (|2.5p . and 

\\A+\\^l/ap{A) (2.8) 

for a matrix ^ of a rank p. A+'^ stands for {A+)'^ = and A~'^ stands for (A"^)^ = (^^)~^ 

An n X m matrix X — A^^' is a left inverse of an m x n matrix A if XA = I and is its right 
inverse if AX = /. is a left or right inverse if and only if a matrix A has full rank. A'^-'^ is 
unique and is equal to A^^ if A is a nonsingular matrix. Theorem 12. II implies the following bound. 

Theorem 2.2. Suppose two matrices C,C + E e C"^" have full rank. Then \ \{C + E)+ - C+|| < 
||i?|| \\{C + E)+ C+\\. 

This bound can be improved where the matrices C and C + E are nonsingular. 

Theorem 2.3. Suppose C and C' + E are two nonsingular matrices of the same size and \ \C~^E\\ = 
< 1. Then \\I ~ {C + E)-^C\\ < and \\\{C + E)-^ - C'^W < j^\\C-'^\\, in particular 
\\\iC + E)-^ - C'^W < 0.5||C-i|| if0< 1/3. 

Proof See [SMl Corollary 1.4.19] for P = -C-^E. □ 



2.5 SMW and dual SMW formulae 

Theorem 2.4. \GL9a page 50], fSM Corollary 4.3.2]. Suppose that U,V e R"^'', the matrices 
A e R"^" and C = A + UV'^ are nonsingular, and < r < n. Then the matrix G = Ir ~ V'^C~^U 
is nonsingular and we have the Sherman-Morrison- Woodbury (hereafter SMW ) formula 

A-^ = C-^ + C-^UG-^V^C-\ 

Corollary 2.1. Suppose that U-,V- e R"^«, A e M"><", A and A-^ + U-V^ are nonsingular 
matrices, and < q < n. Write 

C'Z^ = A-^ + U-VT, H = Ig + V'EAU-, (2.9) 

Then the matrix H is nonsingular and the following dual SMW formula holds, 

C- ^ A- AU-H-'^V^A. (2.10) 

Proof. Apply Theorem 12.41 to the matrices A~^^, V- and CZ^ replacing the matrices A, U, V 
and C, respectively. □ 
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2.6 Condition number, numerical rank and numerical nullity, generic con- 
ditioning profile 

k{A) = ~ 1 1^1 1 11^^ 1 1 condition number of an m x n matrix A of a rank p. Such matrix 

is ill conditioned if cri(A) 3> o'p^A) and is well conditioned otherwise. See |D83| . |GL961 Sections 
2.3.2, 2.3.3, 3.5.4, 12.5], [H02l Chapter 15], |KL94j . and [S98, Section 5.3] on the estimation of norms 
and condition numbers of nonsingular matrices. 

An m X n matrix A has numerical rank nrank(A), not exceeding rank(A), and has the right 
numerical nullity nnul(yl) — n — nrank(A) or just numerical nullity if the ratios crj(A)/||A|| are 
small for j > nrank(A) but not for j < nrank(yl) . The left numerical nullity of the matrix A equals 
the numerical nullity nnul(^"^) = m — nrank(A) of the n x m transpose and coincides with the 
numerical nullity of A if and only if m = n. 

Remark 2.1. One can specify the adjective "small" above as "smaller than a fixed positive toler- 
ance". The choice of the tolerance can be a challenge, e.g., for the matrix diag(l.l~-')^?fo. 

If a well conditioned m x n matrix A has a rank p < I = min{TO,n}, then almost all its close 
neighbours have full rank I (see Section 13. 2p , and all of them have numerical rank p. Conversely, 
suppose a matrix A has a positive numerical rank p = nrank(A) and truncate its SVD by setting 
to all its singular values, except for the p largest ones. Then the resulting matrix A — E is well 
conditioned and has rank p and = (7p+i(A), and so A — E is a, rank-p approximation to the 
matrix A within the error norm bound ap+i{A). At a lower computational cost we can obtain rank-p 
approximations of the matrix A from its rank- revealing factorizations jGE96j , |HP92j , |P00a| , and 
we further decrease the computational cost by applying randomized algorithms in Sections [7] and [8l 

An mx n matrix has generic conditioning profile (cf. the end of Section [2.2[) if it has a numerical 
rank p and if its leading i x i blocks are nonsingular and well conditioned for i = 1, . . . , p. If such 
matrix has full rank (that is if p = min{rn,n}) and if it is well conditioned itself, then we call it 
strongly well conditioned. The following theorem shows that GENP and block Gaussian elimination 
applied to a strongly well conditioned matrix are numerically safe. 

Theorem 2.5. Cf. \PQZa\ Theorem 5.1]. Assume GENP or block Gaussian elimination applied to 
an n X n matrix A and write N — \ \A\\ and — max"^j^ ||(j4j"'^)~"'"||. Then the absolute values of 
all pivot elements of GENP and the norms of all pivot blocks of block Gaussian elimination do not 
exceed N + N^N"^ , whereas the absolute values of the reciprocals of these elements and the norms of 
the inverses of the blocks do not exceed . 



2.7 Toeplitz, Hankel and /-circulant matrices 

A Toeplitz m X n matrix T^.n ~ {l'i-j)T'i=i defined by its first row and column, that is by the 



vector (i/t)™^i_„ of dimension 



\7n,n 

1. We write r„ 



Tn,n = (t.-, )"'"=! (see mM). 



A lower triangular Toeplitz n x n matrix Z{t) = (ti-j)f (where tk — for fc < 0) is defined 
by its first column t — {th)^zl. We write Z(t)^ = (Z(t))'^. Z — Zq = Z{e2) is the downshift n x n 
(see (imi) ). We have Zv = (wOSi^ and Z(v) = Zo(v) = X;r=i v^Z'-^ for v = (w,)f^i and vq = 0, 



Tn 



( to 
ti 



t-1 

to 



tl-n\ 



t-1 

to ) 



z = 



1 



Vo 



(2.11) 



0/ 



Combine the equations ||.^(v) 



\Z{v) 



loo = ||v||i with (|2.2p to obtain 
l^(v)||<||v||i. 



(2.12) 
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Theorem 2.6. Write Tk = (i«-i)*'jio for k ^ n,n+ 1. 

(a) Let the matrix T„ be nonsingular and write p = T^^^ei and q = T^^en- If Pi = e^p ^ 0, 
then piT-^ = Z{p)Z{Jq)'^ - Z{Zq)Z{ZJpf. 

In parts (h) and (c) below let the matrix T^+i be nonsingular and write v = (wi)"^o ^ -^r7+i^i' 

V = ivi^I^, V = K)r=i' w = K)r=o = r-+\e„+i, w - {w,)^-^, and w' = {wi)U- 

(b) If Vq ^ 0, then the matrix r„ is nonsingular and foT„ ^ = Z(v)Z( Jw')"^ — Z(w)Z( Jv')"'". 
(^cj //fn 7^ 0, t/ien f/ie matrix Ti q = (ij-j)"^'i^Lo nonsingular and VnT^Q = Z(w)Z( Jv')-'" — 

Z(v)Z(Jw')^. 

Proo/. See |GS72| on parts (a) and (b); see |GK72| on part (c). □ 

Zf = Z + /e^e„ for a scalar / ^ denotes the n x n matrix of f- circular shift. An f-circulant 
matrix Zf{y) = X^ILi ''^i-^/"^ ^ special Toeplitz n x n matrix defined by its first colmxm vector 

V = {vi)2^i and a scalar /. /-circulant matrix is called circulant if / = 1 and skew circulant if 
/ = — 1. By replacing / with we arrive at a lower triangular Toeplitz matrix Z{\'). The following 
theorem implies that the inverses (wherever they are defined) and pairwise products of /-circulant 
n X n matrices are /-circulant and can be computed in O(nlogrt) flops. 

Theorem 2.7. (See \CPW74l ) We have Zi(v) = Vt-^ D{VL^f)Q.. More generally, for any f ^ 0, 
we have ^/-(v) = Uy'D{Ufv)Uf where Uf = nD{{), f = (/')^=o^ D{u) = diag(uOr=o^ for a 
vector u = {ui)^^Q , fl — {(^ri)"J^Q is the nx n matrix of the discrete Fourier transform at n points, 
uJn = exp(^ being a primitive n-th root of 1, and fl^^ — ^('^,7'"')"7=o ~ ■ 

Hankel m x n matrices H = {hi+jYl^-^i '-^'^ defined equivalently as the products H = rj„ 
or H = JmT oi m X n Toeplitz matrices T and the Hankel reflection matrices J — Jm or J„. Note 
that J ^ = and obtain the following simple fact. 

Fact 2.4. For m ^ n we have T = HJ, R-^ = JT^^ and T^^ = JR-^ if H = TJ, whereas 
T =^ JH , H^^ = JT~^ and T^^ = H^^J if H = JT . Furthermore in both cases k{II) = k{T). 

By using the equations above we can readily extend any Toeplitz matrix inversion algorithm 
to Hankel matrix inversion and vice versa, preserving the flop count and condition numbers. E.g. 
(jr)-i = T-i J, (rj)-i = JT-i, (JH)-^ = H-^J and (HJ)-^ = JH-\ 

2.8 Toeplitz-like, Hankel-like and some other structured matrices 

Let us extend the class of Toeplitz and Hankel matrices to a more general class of structured matrices, 
which we only employ in Section |8l With every pair of n x n operator matrices A and B associate 
the class of n x n matrices AI for which the rank of the Sylvester displacement AM — MB (called the 
displacement rank of M) is small in context. The matrices T with the structure of Toeplitz type (we 
call them Toeplitz-like matrices) have small displacement ranks d = d(A, B) for A = Z^ and B = Zf 
and for any pair of scalars e and /. Such matrices extend the class of Toeplitz matrices, for which 
d < 2. Any variation of a pair of scalars e and / can change the displacement rank of a matrix by 
at most 2, and so the class of Toeplitz-like matrices is independent of the choice of such pair. 

Every matrix of a rank d, and in particular a displacement of a rank d, can be nonuniquely 
represented as the sum of d outer products gjhj of d pairs of vectors gj and hj for j — 1, . . . ,d. 
Motivated by the following result we call the pair of matrices G = Gz^,Zf{M) = and 
H = IIz^,Zf (M) — (hj)^^]^, made up of the vectors gj and hj, a displacement generator of length d 
for the matrix M and for the operator matrices Z^ and Zf where e ^ f (cf. |P011 Example 4.4.2]). 

Theorem 2.8. // ZeM — MZf — Yl'j=i Sj'^J f^'"' of distinct scalars e and f , then 

d 

(e - f)M = Ze{Sj)ZfiJhj). (2.13) 
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The theorem expresses the matrix through the displacement generator {G, H} by using 2dn 
parameters instead of entries. For d n, this is a dramatic compression, which furthermore 
reduces multiphcation of the matrix M by a vector essentially to 2c? multiplications of circulant 
matrices by vectors, that is to 0{dn\ogn) flops. Moreover we can operate with matrices by using 
their displacement representation, which preserves Toeplitz-like structure and can accelerate the 
computations dramatically where c? <C n. For Toeplitz-like matrices T, Ti and T2, scalars e, /, a, 
and /3, and operator matrices A ~ Z^, B ~ Z/, and C = Zc, we can readily obtain the Toeplitz-like 
matrices (if the matrix T is nonsingular) , T"^, aTi -\- (5T2, and T1T2. The following theorem 
bounds the growth of the length of the associated displacement generators and the respective flop 
cost. 

Theorem 2.9. Assume that nx n matrices T , Ti, and T2 have been represented with their displace- 
ment generators of lengths di, d2, and d, respectively, for appropriate operator matrices A = 
and B — Z f , defining Toeplitz-like structure. Then there exist displacement generators of length d 
for T^^ (provided that the matrix T is nonsingular) and T'^ , of length di + d2 for aTi -\- PT2, and 
of length c?i -|- c?2 + 0{1) for T1T2 (for appropriate operator matrices defining Toeplitz-like structure 
and for any pair of scalar a and (3). One can compute these generators by using Oid^nlog^n) ana 
0{did2n\ogn) flops, respectively. 

Proof. The theorem readily follows from Theorem lA.il and CoroUarv lA.ll in Appendix \X[ which also 
define all the respective displacement generators. □ 

A matrix H is Hankel-like if rank(yliJ — HB) is small where A = Ze and B = Zj for two scalars 
e and / or alternatively where A = Zj and B = Zf. It follows that MN is a Hankel-like matrix 
if one of the factors is a Toeplitz-like matrix and another is a Hankel-like matrix, whereas MN is 
a Toeplitz-like matrix if both M and N are Hankel-like matrices or both are Toeplitz-like matrices. 
We can alternatively define Hankel-like matrices as the products TJ or JT where T is a Toeplitz- 
like matrix, or we can define Toeplitz-like matrices T as the products HJ and JH where H are 
Hankel-like matrices (cf. Fact 12. 4p . By using these properties we can readily extend our algorithms 
as well as expressions (I2.13P (cf. |P011 Example 4.4.4]) from the case of Toeplitz and Toeplitz-like 
to Hankel and Hankel-like matrices, preserving the flop count. 

Remark 2.2. By choosing the operator matrices A and B among f -circulant and appropriate di- 
agonal matrices we define the important classes of matrices M with the structures of Vandermonde 
and Cauchy types whose displacement rank, that is rank(v4M — MB), is small. This extends the 
classes of Vandermonde matrices = j^i, having displacement rank 1 for the operator ma- 

trices A ~ diag(a:i)"^]^ and B = Zj for a scalar f , and Cauchy matrices Cs.t — ( ^ . )^ j=i! having 
displacement rank 1 for the operator matrices A = diag(si)"^]^ and B — diag(tj)"^]^ . Alternatively 
fP90^ . IPOJJ, the matrices of these classes can be defined as the products U MV where M is a Toeplitz 
matrix, whereas U and V are properly selected among Vandermonde matrices, their transposes and 
the identity matrices. Similarly to the Toeplitz- Hankel link at the end of the previous subsection, this 
enables us to extend any successful algorithm for Cauchy-like inversion to Toeplitz-like, Hankel-like 
and Vandermonde-like inversion and vice versa because {U MV)^^ — V^^M^^U~^ fPQOf . although 
unlike the orthogonal reversion matrix J, Vandermonde multipliers and their transposes are usually 
ill conditioned except for a narrow but important class including the matrices fl and il^ of Theorem 
\2.1\ Theorems \2.^ and \2.9\ and other basic properties of Toeplitz-like and Hankel-like matrices can 
be extended to the matrices having structures of Vandermonde or Cauchy types (see JPOOf . \P01f or 
Appendix A ). 

3 Ranks and conditioning of Gaussian random matrices 

3.1 Random variables and Gaussian random matrices 

Definition 3.1. Fy{y) = Probability{j < y} for a real random variable 7 is the cumulative distri- 
bution function (cdf) 0/7 evaluated at y. Fg(^f^ a-){y) — „^2-n I-oo ^^P(~ ^^2a^^ )'^^ f^''" ^ Gaussian 
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random variable g{^,a) with a mean /i and a positive variance , and so 

fi — 4:(7<y<fi + 4:a with a probability near 1. (3-1) 

Definition 3.2. A matrix or a vector is a Gaussian random matrix or vector with a mean fi and a 
positive variance if it is filled with independent identically distributed Gaussian random variables, 
all having the mean fi and variance . ^J™^" '^f such Gaussian random mxn matrices 

(which are standard for = and = 1). By restricting this set to Toeplitz or f-circulant matrices 
we obtain the sets 7^^" and -Z^^'^ o/ Gaussian random Toeplitz and Gaussian random /-circulant 
matrices, respectively. 

Definition 3.3. Xti.a,niy) is the cdf of the norm ||v|| = (X]r=i )^^^ ^ Gaussian random vector 
V - (wOf^i e g;i^^\ Fory>0 we have xoaAv) = 2"/2rV/2) Jo exp(-a:V2)rfa; where T{h) = 
x'*^^ exp(— T{n + 1) = n! for nonnegative integers n. 

3.2 Nondegeneration of Gaussian random matrices 

The total degree of a multivariate monomial is the sum of its degrees in all its variables. The total 
degree of a polynomial is the maximal total degree of its monomials. 

Lemma 3.1. fDL78^ , \S80^ . \Z79^ . For a set A of a cardinality |A| in any fixed ring let a polynomial 
in m variables have a total degree d and let it not vanish identically on this set. Then the polynomial 
vanishes in at most rfjAI™^^ points. 

We assume that Gaussian random variables range over infinite sets A, usually over the real line 
or its interval. Then the lemma implies that a nonzero polynomial vanishes with probability 0. 
Consequently a square Gaussian random general, Toeplitz or circulant matrix is nonsingular with 
probability 1 because its determinant is a polynomials in the entries. Likewise rectangular Gaussian 
random general, Toeplitz and circulant matrices have full rank with probability 1. Furthermore 
all entries of such matrix A and of its adjoint adj A are subdeterminants and thus are nonzeros 
with probability 1. Clearly this property of the adjoint also holds for the inverse A^^ = i^t^ 
the matrix A is nonsingular. Hereafter, wherever this causes no confusion, we assume by default 
that Gaussian random general, Toeplitz and circulant matrices have full rank, and their inverses 
(if defined) have nonzero entries. These properties can be readily extended to the products of the 
latter matrices by nonsingular and orthogonal matrices, and further to various functions of general, 
sparse and structured matrices. Moreover similar properties hold with probability near 1 where the 
random variables are sampled under the uniform probability distribution from a finite set of a large 
cardinality (see Appendix A). 

3.3 Extremal singular values of Gaussian random matrices 

Besides having full rank with probability 1, Gaussian random matrices in Definition [3]2] are expected 
to be well conditioned [D88] . [E88] . [E5U5] . |CD05j . [BTT] . and even the sum M + A for M g R"^" 
and A S GJ^a" is expected to be well conditioned unless the ratio tT/||Af|| is small or large [SST06j . 

The following theorem states an upper bound proportional to y on the cdf f (y), that is 
on the probability that the smallest positive singular value 1/||A+|| = <ti{A) of a Gaussian random 
matrix A is less than a nonnegative scalar y (cf. (j2.8p ) and consequently on the probability that the 
norm ||A+|| exceeds a positive scalar x. The stated bound still holds if we replace the matrix A by 
A — B for any fixed matrix B, although for B — Om,n the bounds can actually be strengthened by 
a factor yl™-"l [ES05) . jCD05) . 

Theorem 3.1. Suppose A G ^J^^^", B G K'"^^", I = min{m,n}, x > Q, and y > 0. Then 
Fa,(A-B){v) < 2.35 ^^ly/a, that is Probability{\\{A ~ B)+\\ > 2.35xVl/cr} < l/x. 

Proof. For m — n this is [SST06| Theorem 3.3]. Apply Fact 12. 31 to extend it to any pair {m, n}. □ 
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The following two theorems supply lower bounds _F'||^||(z) and F^(^j^-^{y) on the probabilities that 
||A|| < z and n{A) < y for two scalars y and z, respectively, and a Gaussian random matrix A. We 
do not use the second theorem, but state it for the sake of completeness and only for square n x n 
matrices A. The theorems imply that the functions 1 — i^||yi||(z) and 1 — F^(^j^-^{y) decay as z ^ oo 
and y — > oo, respectively, and that the decays are exponential in —z^ and proportional to \/\ogy/y, 
respectively. For small values ya and a fixed n the lower bound of Theorem 13.31 becomes negative, 
in which case the theorem becomes trivial. Unlike Theorem 13.11 in both theorems we assume that 
^l = 0. 

Theorem 3.2. WSOli Theorem II.7]. Suppose A e GJ!^^", h = max{m,n} and z > 2aVh. Then 
^\\A\\{z) ^ 1 ~ exp(— (z — 2(tV7i)^/(2i7^)), and so the norm \\A\\ is expected to have order a^fh. 

Theorem 3.3. fSSTOa Theorem 3.1]. Suppose 0<ct<1, y>l, ^J^". Then the matrix A 
has full rank with probability 1 and F^(^j^-^{y) > 1 — (14.1 + A.^^J{2 hiy)/n)n/{ya). 

Proof. See }SST06| the proof of Lemma 3.2]. □ 



3.4 Extremal singular values of Gaussian random Toeplitz matrices 

A matrix T„ = {ti-jY^ is the sum of two triangular Toeplitz matrices 

r„ = Z{t) + Z{t^f, t = (to^o . t- = (t-.)7=o, to = 0. (3.2) 

If Tn e 7^"^", then Tn has 2n — 1 pairwise independent entries in Gfi.a- Thus (|2.12p implies that 

iir„ii < iiz(t)ii + iiz(t_)^ii < iit||i + iit_iii = ii(t,)r=^„iii < ii(^.)r=i-„ii- 

Recall Definition 13.21 and obtain 

F\\T„\\{y) > XM,'T,2«-i(y/V2^)- (3.3) 

Next we estimate the norm ||r^^|| for T!„ £ 7^^"- 
Lemma 3.2. fSST06[ Lemma A. 2]. For a nonnegative scalar y, a unit vector t G R"^^, and a 
vector b G g'^^^\ we have F^tT^iy) < ^ff . 

Remark 3.1. The latter bound is independent of fi and n; it holds for any fi even if all coordinates 
of the vector b are fixed except for a single coordinate in Q^^a- 

Theorem 3.4. Given a matrix T„ — (t,;-j)fj^x ^ 7^"^") assumed to be nonsingular (cf. Section 
\3.2il . write p I ~ e^T^^ei. Then F^i^y^_^j,-i^^{y) < 2naji for two random variables a and jS such that 

FM) < and Fp{y) < \[^^ for y > 0. (3.4) 

Proof Recall from part (a) of Theorem that piT"^ = Z{p)Z{Jq)'^ - Z{Zq)Z{ZJpf . Therefore 
|biT-i|| < ||^(p)|| ||Z(Jq)^|| + ||Z(Zq)|| ||Z(ZJp)^|| for p = T-^ei, q = T'^e™, andpi =p^ei. 
Itfollowsthat ||pir-i|| < ||Z(p)|| ||Z(Jq)|| + ||Z(Zq)|| ||Z(ZJp)|| since ||A|| = 1 1 for all matrices 
A. Furthermore \\piT-^\\ < ||p||i ||Jq||i + ||Zq||i ||ZJp||i due to (ET^ . Clearly ||Jv||i = ||v||i 
and ||^v||i < ||v||i for every vector v, and so (cf. (|2.2I) ) 

|biT-i|| <2||p||i ||q||^<2„||p|| ||q||. (3.5) 

By definition the vector p is orthogonal to the vectors T„e2, . . . , T„e„, whereas p-'^Tnei = 1 (cf. 
[SST06| ). Consequenty the vectors r„e2, . . . ,T„e„ uniquely define the vector u = p/||p||, whereas 
|u'^T„ei| — l/||p||. The last coordinate tn-i of the vector T„ei is independent of the vectors 
T!ne2, . . . , T„e„ and consequently of the vector u. Apply Remark 13.11 to estimate the cdf of the 

random variable a = l/||p|| = |u^T„ei| and obtain that Fa{y) < for y > 0. 
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Likewise the n — 1 column vectors Tei, . . . ,T„_i define the vector v ~ /3q for [3 — l/||q|| = 
|v'^T„e„|. The first coordinate ti_„ of the vector r„e„ is independent of the vectors Tei, . . . ,T„_i 
and consequently of the vector v. Apply Remark 13.11 to estimate the cdf of the random variable f3 

and obtain that Fp[y) < \J^^ for y > 0. Finally combine these bounds on the cdfs F^iy) and 

Fp{y) with (031). □ 

By applying parts (b) and (c) of Theorem 12.61 instead of its part (a) , we similarly deduce the 
bounds llfoT^rT+ill < 2a/3 and [["ynlV+ill < 2a/3 for two pairs of random variables a and /? that 
satisfy (|3.4p for n + 1 replacing n. We have pi = , vq = , and w„ — f^^j^"-! for 



detT„ ' ~ dctT„ + i' ~ dotT„+i 

7o,i = {ti-jYiZo'jLi- Next we bound the geometric means of the ratios | '^dctrt^ I for /i = 1, . . . , fc — 1. 
l/|pi| and l/|fo| are such ratios for k — n — \ and k — n, respectively, whereas the ratio l/lvnl is 
similar to l/|uo|, under slightly distinct notation. 

Theorem 3.5. Let Th ^ O denote h x h matrices for h = l,...,fc whose entries have absolute 
values at most t for a fixed scalar or random variable t, e.g. for t — \ \T\\. Furthermore let Ti — (t). 
Then the geometric mean (UlZl I '^aItT I)^^^''"^^ = ll detTfeji/^'^-i) is at most k^'^'^+^^k. 

Proof. The theorem follows from Hadamard's upper bound |detA/| < k^^'^t^, which holds for any 
k y. k matrix M = {mi.j)^ with maxf ^^j^ I'^ii.jl ^ ^- 

The theorem says that the geometric mean of the ratios | det Th+i/ del Th\ for h — 1, . . . , fc — 1 
is not greater than /j0.5+£(fe)^ where t{k) ~> as fc oo. Furthermore if T„ e ^J,' '^^ can write 
t — \ \T\\ and apply (|3.3p to bound the cdf of t. 

3.5 Extremal singular values of Gaussian random circulant matrices 

Next we estimate the norms of a random Gaussian /-circulant matrix and its inverse. 



Theorem 3.6. Assume y > and a circulant n x n matrix T = ^i(v) for v e GJ!^^. Th- 



en 



(o-) ^||T||(y) > Xt^,a.niy T^y) for Xt^.^Av) i'^ Definition\K^and (b) F^/\\T-^\{y) < y f ^■ 

Proof. For the matrix T = ^i(v) we have both equation p.2p and the bound ||t_||i < ||t||i, and 
so ||r||i < 2||t||i. Now part (a) of the theorem follows similarly to (13. 3p . To prove part (b) recall 
Theorem Upland write B = nTVL'^ = D{n), u = {uiYlI^ = rJv. We have aj{T) = aj{B) for all j 
because and ^/nfl^^ are unitary matrices. By combining the equations Ui = ejilv, the bounds 

||3?(ef 0)11 > 1 for all i, and Lemma deduce that F|s)f(tii)| (y) — \f^a for z = 1, . . . , n. We have 
Fa„{B){y) = ^min. \u,\iy) because B = diag(^ii)"^o^ ^nd clearly \u,\ > \di{ui)\. □ 



Remark 3.2. Our extensive experiments suggest that the estimates of Theorem 1 3. 6\ are overly pes- 
simistic (cf. Table^TU^. 

Combining Theorem 12 . 71 with minimax property p.6p implies that 

1 



gU) 



TjiZiM) < cj,{Zf{w)) < g{f)a,{Z^{v)) 



for all vectors v, scalars / 7^ 0, g{f) — max{|/p, l/|/p}, and j = l,...,n. Thus we can readily 
extend the estimates of Theorem 13.61 to /-circulant matrices for f ^ 0. In particular Gaussian 
random /-circulant matrices tend to be well conditioned unless / w or 1// sa 0. 
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4 Condition numbers of randomized matrix products and 
generic preconditioning 

Next we deduce probabilistic lower bounds on the smallest singular values of the products of fixed 
and random matrices. We begin with three lemmas. The first of them is obvious, the second easily 
follows from minimax property (j2.6p . 

Lemma 4.1. aj{SM) = <7j(MT) = a.j{M) for all j if S and T are square orthogonal matrices. 

Lemma 4.2. Suppose E = diag(cr,) -Li, di > era > • • • > cr„; G G M''^", H e K"^''. Then 
aj{GT,) > aj{G)(Tn, <Tj{YiH) > aj{H)an for all j. If also ct„ > 0, then rank(GS) = rank(G), 
rank(I]ff) = rank(i7). 

We employ the following result in the proof of Corollary [O] 

Corollary 4.1. We have k{AB) < k(A)k{B) if A or B is a nonsingular matrix. 

Proof. Assume SVDs A = Sa^aT^ of Then aj{AB) = (Tj{Sa'£'aTIB) = aj(T.AB) where 

B = T^B. Let A and consequently T,a be nonsingular nxn matrices. Apply Lemma 14.21 and deduce 
that ajC^AB) > <jj{B)<Jn{A), whereas <Jj{B) = (7j{B) for all j. We have p = rank(Ai3) = rank(i?) < 
n. Combine the relationships above for j = p and obtain that ap{AB) — ap{T,AB) > ap[B)an{A) — 
ap{B)an{A), and so (Tp{AB) > ap{B)<Tn{A). Also note that ||^-B|| < Combine the latter 

bounds and obtain that K(ylS) = \\AB\\/ap{AB) < \\A\\ \\B\\/ {ap{B)an{A)) ^ k{A)k{B). Similarly 
prove the claimed bound where i? is a nonsingular matrix. □ 

Lemma 4.3. ISST061 Proposition 2.2]. Suppose H e Q^^", SS^ ^ S^ S = I,n, TT^ = T^T = /„. 
Then SH G g™^" and HT e a"^". 

The following theorem implies that multiplication by standard Gaussian random matrix is un- 
likely to decrease the smallest positive singular value of a matrix dramatically, even though UV — O 
for some pairs of rectangular orthogonal matrices U and V . 

Theorem 4.1. Suppose G' E Gl^^J^ , H' e Ql'^J , M e R™^", G = G' + U , H = H' + V for 
some matrices U and V, r{M) ~ rank(A/), a; > and y > 0. Then i^i/| | (gm)+ 1 1 (y) < F{yyM,a) 
and i^i/||(A.m)+||(2/) < F{y,M,a) for F{y,M,a) ^ 2.35yVr\\M+\\/a andr = min{r,r(M)}, that is 
Probability{\\P+\\ > 2.35xVr\\M+\\/a} < 1/x for P = GM and P = MH. 

Proof. With probability 1, the matrix MH has rank f because H e GJ^^^J'. So (cf. 

Fi/\\{MH)+\\{y) =Fa^(MH){y)- (4.1) 

Let M = Sm^mTJi be full SVD where Ea/ = diag(EAf,0) = Ea/ diag(/^(A/), O) and Ea/ = 
diag(crj {M)Y^^P is a nonsingular diagonal matrix. We have MH = Sm^mTJ.jH , and so Oj {MH) = 
ajiTiMTj^jH) for all j by virtue of Lemma [4.11 because S]\j is a square orthogonal matrix. Write 
Hr(M) — {Ir{M) \ 0)T^ H and observe that aj{YiMTj,jH) — aj{TjMHr{M)) and consequently 

aj{MH) = a,{^MH,^M)) for all j. (4.2) 

Combine equation (|4.2p for j ~ r with Lemma [4.21 for the pair (E, ) replaced by (EA^,i?r(M)) 
and obtain that apiMH) > a^(^M)[M)af{H^(^M)) = (^r[Hr(M)) /\\M+\\. We have TjjH' e GJ^^J by 
virtue of Lemma 14.31 because Tm is a square orthogonal matrix; consequently H^^m) — -^^(a/) ~^ ^ 

for H^f^j^j^ e Gjilar^'^^^ and some matrix B. Therefore we can apply Theorem 13.11 for A — H'^^^j^j-^ 
and obtain the bound of Theorem 14.11 on -F'i/||(Af_ff)+||(y)- One can similarly deduce the bound on 
-P"i/||(GAf)+|| (y) or can just apply the above bound on -Fi/||(Af_ff)+||(y)) foi" H = G^ and M replaced 
by M^ and then recall that {M^G"^)^ = GM. □ 
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By combining (|2.4I) with Theorems 13.21 (for B ^ O) and 14.11 we can probabihsticaUy bound 
the condition numbers of randomized products GM and AIH. The folfowing corollary extends the 
bound of Theorem l4.1l for a randomized matrix product to the bounds for its leading blocks. 

Corollary 4.2. Suppose j, k, m, n, q and s are integers, I < j < q, 1 < k < s, AI €z R'"^", 

a > 0, G e H e rank(Mj) = j for M, = M { )' rank(M('=)) = k for 

— (/j, I Ok.m-k)M , and y > 0- Then (i) with probability 1 the matrix GM (resp. MH) has 
generic rank profile i/rank(Af) > q (resp. i/rank(M) > s). Furthermore (ii) F^^^^^^^-^j^{j)^j^^^{'y) < 

2.35yVj/{\\M+\\a) */rank(M) > ^^i/||((Mif)<^')+|| (f) ^ 2.35yVfc(||(Af('-))+||a) z/rank(M) > k. 

Proof. We immediately verify part (i) by applying the techniques of Section 13.21 To prove part (ii) 

apply Theorem 14.11 replacing G by {Ij \ Oj^q-j)G and replacing M by M ( ). For every k 

apply Theorem 14.11 replacing M by {Ik \ Ok,m-k)M and replacing iJ by iJ I „ J . □ 

Remark 4.1. It is well known that GENP and block Gaussian elimination are numerically unsafe 
where the input matrix M has a singular or ill conditioned leading block, but if this matrix itself is 
well conditioned, then the latter results combined with \2.4^ and Theorems \2.5\ and \3.2\ for B — O 
imply that multiplication by Gaussian random matrices is expected to fix this problem. Namely both 
elimination algorithms applied to the matrices GM and MH for G G G^^i"^ and H G ^oi" are 
expected to use no divisions by absolutely small values. 

Remark 4.2. We cannot extend the proofs of Lemma \4-^ o,nd consequently Theorem and its 
corollaries to the case of Gaussian random Toeplitz matrices G G 7^^™ '^'^'^ ^ ^ T^^^ , but the 
results of our tests have consistently supported such extensions (cf. Tables \l0.6\ and \l0.10\) . This is 
also the case for our results in the next two sections. 



5 Randomized additive and dual additive preconditioning 

In this section we prove Theorem 11.11 and extend it to the cases of rectangular m x n matrices A 
and dual additive preprocessing. At first we prove the following specification of the theorem where 
instead of the matrix A having numerical rank p we deal with its SVD truncation having rank p. 

Theorem 5.1. Suppose A is a real n x n matrix of a rank p, < p < n, all and aV are standard 
Gaussian random n x r matrices, whose all 2nr entries are independent of each other, < r < n, 
andC = A + UV^. Then (i) we have \\A\\2 - \\U\\2 \\V\\2 < \\G\\2 < \\A\\2 + \\U\\2 \\V\\2, which 
implies lll.2\) . (ii) the matrix C is singular if r < n ~ rank(yl), (Hi) otherwise it is nonsingular with 
probability 1, and (iv) the value (Jn{C) is expected to have at most order (Jp{A) if the ratio a/\\A\\2 
is neither large nor small, e.g., if < a/\\A\\2 < 100. 

5.1 Proof of Theorem 15.11 (parts (i)— (iv), case r = n — p) 

Part (i) is an immediate observation, which implies (|1.2p by virtue of (13. ip and Theorem 13.21 for 
A — U , A — V and h = n. Furthermore we readily prove parts (ii) and (iii) of the theorem 
(on singularity and nonsingularity) by applying the techniques of Section [3^ To prove part (iv), 
that is to bound the ratio ||C-i||/||A+||, we factorize the matrix C, which involves a number of 
technicalities. In this subsection we only handle the case where r — n ~ p. 

Theorem 5.2. Suppose A,C,S,T G M"^" and U,V E M"^*" for two positive integers r and n, 
r < n, A — SYiT'^ is full SVD of the matrix A (cf. 112.5]) ). S and T are square orthogonal matrices, 
S = diag((Tj)"^]^, the matrix C ~ A + UV'^ is nonsingular, and so p = rank(yl) = n — r and > 0. 
Write 
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where Ur and Vr are r x r matrices. Then 

(a) Ru^Ry = whereas Ru dia.g{Op,p, Ir)Rv = S'^UV^T, and so 

C ^ SRuDR^T'^, D = Ys + diag(Op,p, 4) = diag(dj)^^i (5.2) 

where dj = aj for j = 1, . . . , p, dj = aj + 1 for j — p + 1, . . . ,n. 

Furthermore suppose that the matrix A has been normalized so that ||A|| = 1 and that the r X r 
matrices Ur and Vr are nonsingular, which holds with probability 1 where U and V are Gaussain 
random matrices (cf. Section ] 3. 2\) . Write 

p=\\Ru'\\ WRv'W and/, = max{l,||t/-i||} niax{l, (5.3) 

Then 

(b) the matrix C is nonsingular, 

(c) 1< \\Ru\\<<yp{A)/a^{C)<p, 

{d)p<{l + \\U\\){l + \\V\\)fr, 

(e) 1 < a,(A)/a„(C) < (1 + \\U\\){1 + ||y||)/,. 

Proof. Parts (a) and (b) are readily verified. 

(c) Combine the equations S-^ = 5"^, = T and 1^ and obtain C"! = T Ry^ D''^ R^^ S'^ 
or equivalently D'^ = RvT'^C^SRu. It follows that \\C-^\\ = \\Ry^ D^^R-{j^\\ and \\D-'^\\ = 
\\RvT^C-^SRu\\. Apply bound dM]), substitute \\S\\ = \\S'^\\ = \\T\\ = ||T'^|| = 1 and obtain 
||C-i|| < lliiy^ll \\D-^\\ and \\D-^\\ < ||C-i|| \\Ru\\- Substitute the equations (lOl) . 
\\D-^\\ = l/(Jp{A) (implied by the equations ||A|| = 1 and dO])) and \\C-'^\\ = l/cr„(C) and the 
bounds > 1 and > 1 and obtain that 1 < ap{A)/an{C) < p. 

(d) Observe that = (^^ (^^ ^^_X Ry' ^ (^S (o i) ' 1 1^1 1 ^ I l^^l I 



and \\V\ \ < \\V\\. Then combine these relationships. 

(e) Combine the bounds of parts (c) and (d). □ 

We have < ||^ a''\c\ ' ^^'^ parts (d) and (e) together bound the ratio in terms of 
the norms \\U\\, \\V\\, \\U;-^\\ and llVJ-^W as follows, 



^<il + \\U\\ \\V\\)il + ||[/||)(1 + \\V\\) max{l, \\U-'\\} max{l, (5.4) 



and in particular 

k{A) 



< (1 + ||C/|n(l + \\U\\fmax{l,\\U-^\\^} if C/ = K (5.5) 



Let us estimate the norms \\U\\, \\V\\, \\Uj. ^\\ and \\Vj. ^\\ where U and V are Gaussian random 
matrices. 



Theorem 5.3. Suppose that A, U, V, Ur and Vr denote the five matrices of Theorem \ 5.2\ where 
U,Ve g^p^J. Then max{i^,/||y-i||(2/),F,/||^-i||(y)} < 2.35 y^/a for y > 0. 

Proof. Lemma [4.31 implies that S^U,T^V G Gp^^ by virtue of Lemma [4.31 because S and T are 
square orthogonal matrices. Hence Ur,Vr E Gp^J. Apply Theorem 13.11 for A — Ur and A = Vr 
where in both cases m = n = r. □ 

Combine Theorem 15.31 with relationships (|5.3I) - (|5.5|) and obtain part (iv) of Theorem 15. II in the 
case where r = n — p and ||A|| = 1. Relax the normalization assumption by scaling the matrix A. 

Remark 5.1. So far our proof of Theorem 15. il remains valid where U = V E Gai^j ^'at not so in 
the case where r > n ~ p, covered in the next subsection. We can make similar comments on the 
extension to the proofs of Theorems \l.l \ and \5.6\ in Section\5^ 
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5.2 Proof of Theorem 15.11 (part (iv), case r > n — p) 

Next we extend the proof of part (iv) to the case where s = r + p—n > 0. The extension is immediate 
where the matrix A is nonnegative definite and U = V but is more involved and less transparent 
in the general case. Write U = {U'^'l \ Us) and V = {V'^"^ \ Vs) where U^'^V^'^ G ^oT"^'"' 
and Us,Vs G Gq^J ■ As we have proved already, the n x n matrices 

C = A + UV'^ = C''*' + UgV^ are nonsingular with probability 1 and are expected to have norms 
of order for reasonably bounded values a (cf. (|1.2p ). whereas the matrix C*^*-' is expected to be 
well conditioned, that is the ratio Ks = HC'^-* ||/fT„(C*^*'') > 1 is not expected to be large. To simplify 
the notation, scale the matrices A, C, U and V to have cr = 1, expecting that the scaling factor and 
the new value of the norm HC-^^H is neither large nor small (cf. (jS.ip '). 

Let C^"') = SY.T^ be SVD, where S = diag(crj(C(")))J'^i, premultiply the equation C = C^"'' + 

UsV^ by S^, postniultiply it by T, write C = S^CT, U = S^Us, and = V^T, and obtain 
C = T, + UV'^ where aj{C) = crj(C) for all j by virtue of Lemma and U ,V G Q'^^ by virtue of 
Lemma |4.3[ because S and T are square orthogonal matrices. Furthermore write 5ij = for i ^ 
5ii — 1, Ci — C^e^, cJ — C^^Bj, for i, j — 1, . . . ,n, and so 

cfcj = efej = for i,j = l,...,n, (5.6) 

IIS^II = l/'^J^J ^ (5.7) 
Indeed for every j the unit vector Cj = cj/\\cj\\ — (cij)^^^ is unique because the vector cJ is 
orthogonal to all vectors Ci for i ^ j (cf. (|5.6p ). Combine the latter equation Cj — c~/||c~|| with 
cjcj ~ 1 and deduce equation ()5.7p (cf. |SST06| Proof of Lemma 3.2]). 

Next we estimate the norm \ \cj\\ from above or equivalent ly the value cJcj from below for any 
fixed integer j, I < j < n. We write uj — eJU £ Gq^^ and tj = V^Cj and are going to deduce a 
probabilistic upper bound on the norm ||tj|| as long as we have a probabilistic upper bound on the 
norm ||cj||. Represent the value cJcj as eJCcj — eJScj + eJUV^Cj and infer that 



=J2, =a,(C(^')%+Sjt,. (5.8) 
) = Uj, rec 

Write p = F^T^ . (y) and infer that 



Recall Lemma [3.21 for a = 1 and b = u,, recall Remark |3.1[ and obtain that F„Tp (y) < \ -tt-t:- 

' c.Cj\a/ — y IT ||tj|| 



l|t,ll<J--. (5.9) 

V TT p 

Next deduce upper estimates for the values \cij\ for all i, at first for i — j. Obtain from equation 
that Cjj — (cJcj — uJtj)/(Tj (C*^*^). Substitute cJcj < y and (|5.9[) and obtain \cjj\ < {y + 

||t,||)M(C(^)) <iy+ \\u,\\ ^y-)/a,{Ci^y), and so 



ujjl < (l + y-^)2/Ks, (5.10) 

where the value Kg — 1/ct„(C(''^) is not expected to be large and where the cdf -F||a ||(2/) — Xos,n{y) 
is bounded in Definition 13.31 

Next let i ^ j, recall that cf cJ — (cf. (15. 6p ). substitute cJ = eJC = e[l] + ufV"^ and obtain 

cfcJ = efT.cJ +ufV'^cJ = cr,(C("))% + uf = 0. Hence |c„| < |uftj |/cr,(C(")). Substitute 
()5.9| and obtain 

\c^J\ < \ -^-^yKs for ah z ^ J. (5.11) 

V TT p 

Combine equations (|5.10|) and (|5.1ip and obtain that ||cj|p = — IV^^I where 7 ~ 

1 + 2^1^ + I|u<IIVp^ and ^||a.||(2/) = Xo,i,"(2/) for aU i. Recall that c^- is a unit 
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vector, and consequently 7y^Kj > 1, which probabilistically bounds the ratio y/p from below, thus 
implying the desired upper bounds on the norms ||c~|| — ||C~^ej|| for all j and consequently 
= IIC*"^!! < This completes our proof of part (iv) of Theorem 15.11 



5.3 Extension of Theorem 15.11 to the case of rectangular matrices 

Clearly part (i) of Theorem 15 . 1 1 holds for any pair of m and n. By extending the concept of singularity 
of a matrix to its rank deficiency, we readily extend parts (ii) and (iii) . Next we employ Fact 12.31 
and Lemma 14.31 to extend our upper bound on (Tp(A)/cr„(C) to the case where m ^ n. 

Theorem 5.4. Suppose A e K™^", U E Gq^^^ , and V £ Gq^^ for three positive integers m, n and 
r, the matrix C = A + UV^ has full rank I — min{m, n}, and I — r < p = rank(A). Keep equation 
i5.1]) but write 

Im^nS-U = (^J , /,„,„r-F = (^J (5.12) 

for Ig^fi of 112. 1\) where Ur and Vr still denote r x r matrices. Keep the other assumptions of parts 
(a)-(e) of Theorem \5.2[ Then the upper bound of part (e) of Theorem \5.2\ can be extended, that is, 
ap{A)/ai{C) < (1 + \\U\\){1 + \\V\\)fr where fr - maxjl, \\U~^\\} max{l, 11^-^11} as in (EM) and 
where Ur,Vr E Gq^J . 

Proof Let A = Sa^aTI be SVD of ([231). Write d = I^,nSlCTAllm, U = lm,nSlU , V = 
In,mTlV, A = Irn,nSlATAllm, and SO A ^ (fTj(A))j^i and C = A + UV^. Apply Theorem O 
to the Z X Z matrices A and C and obtain that (Tp(A)/cr„(C) < (1 + ||J/||)(1 + ||V^||)/r- Complete 
the proof of Theorem 15.41 by combining this bound with the relationships a-p{A) — ap{A), cri{C) < 
ai{SlCTA) = ai{C), \\U\\ = ai(C/) > a^{U) = \\U\\, and \\V\\ = a^{V) > a,{V) = \\V\\. Here the 
equations hold by virtue of Lemma |4.1[ because the matrices Sa and T4 are square and orthogonal. 
The inequalities hold by virtue of Fact 12. 3[ because C, U, and V are submatrices of the matrices 
S'^CTa, S^U, and V^Ta, respectively. □ 



Combine Theorems 15.31 and 15.41 to yield the following result. 

Theorem 5.5. Assume that A G M'"^^", U e G^^'^ , V E Gq^'' , C = A + UV^, and I = min{m,n}. 
Then the matrix C is rank deficient if r < I — rank(A). Otherwise with probability 1 the matrices 
U and V have full rank and the bound ap{A)/ai{C) < (1 + ||i7||)(l + ||V^||)/r of Theorem \5.4\ holds, 
the norms \\U\\ and \\V\\ satisfy the randomized bounds of Theorem \3.S\ (for A = U and A = V), 
and the values \\U~^\\ and \ \V~^\\ satisfy the randomized bounds of Theorem \5.3\ 

Corollary 5.1. Theorem 15. il can be extended to the case of matrices A E M™^", U E G^^^ arid 
V E Gq^^ for any pair of positive integers {to, n} and I = min{TO, n}, that is the matrix C = A+UV'^ 
is rank deficient if r < I ~ rank(A), whereas for r > I — rank(A) it has full rank with probability 
1 and is expected to have condition number of at most order \\A\\/ai-r{A) if the ratio cr/||A||2 is 
neither large nor small, e.g., if < a/\\A\\2 < 100. Consequently the matrix C is expected to be 
nonsingular and well conditioned if the matrix A has numerical rank at least I — r. 



5.4 Extension to and from Theorem 11.11 

To extend Theorem 15.11 to Theorem 11.11 truncate the SVD of the matrix A having numerical rank 
p < n hy setting to all its singular values, except for the p largest ones. This produces a well 
conditioned matrix A — E oi rank p where \\E\\ — (Tp+i(A) and the ratio ||£'||/||A|| is small because 
the matrix A has numerical rank p. Now to obtain Theorem 11.11 combine Theorem 15.11 (applied to 
the matrix A~ E rather than A) with Theorem and the simple bounds ||A|| - \ \E\\ < \\A\\ < 
1 1^1 1 + ||£'|| and ||C|| — ||£^|| < ||C|| < ||C|| + \ \E\ \ and observe that an upper bound of at most order 
I |A| |/(Tp(yl) on k{A) implies that the matrix A having numerical rank p is well conditioned. 
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We can apply Theorem [52] instead ol TlLeorem l2.3l and similarly extend the results of the previous 
subsection, to rectangular matrices A. To yield stronger estimates, however, one should avoid using 
Theorem 12.21 and instead extend Theorem 11.11 to the case of rectangular input by applying our 
techniques of the proof of Theorem 15.41 Here is the resulting extension of Theorem ll.il 

Theorem 5.6. Suppose A is a real m x n matrix having a numerical rank p (that is the ratio 
<jp^i{A)/\\A\\ is small, hut the ratio ap{A)/\\A\\ is not small), the (m + n)r Gaussian random 
entries of two matrices U G Gq^^^ and V G ^oa'' '^^^ independent of each other, C = A + UV^ , 
< r < I and < p < I = min{r7i,n}. Then (i) hounds of hi. 2) hold, (ii) the matrix C is singular 
or ill conditioned if r < I — p; (Hi) otherwise it is nonsingular with prohahility 1 and (iv) is expected 
to he well conditioned if the ratio (t/\\A\\2 is neither large nor small, e.g., if < (j/\\A\\2 < 100. 

The next corollary slightly generalizes Theorem l5.6l to match the augmentation map of Theorem 
16.21 of the next section. 

Corollary 5.2. Suppose that A, U and V denote the same matrices as in Theorem \5.6l I still 
denotes the integer min{m, n}, but C ^ A + UAIV^ for a normalized nonsingular r x r matrix M , 
\\M\ \ = 1. Then Theorem \5.6\ can be extended as follows: the matrix C is rank deficient if r < I — p; 
otherwise it has full rank with probability 1 and is expected to have condition number of order k{M) 
if the ratio a/\\A\\2 is neither large nor small, e.g., if, say < a'/\\A\\2 < 100. 

Proof. Let M ^ Sm^mTIi be SVD and rewrite C = A + UMV^ as C = A + UV'^ where U = USm 
and V = Y^mTIiV. Note that U e GJJ^^'' and TmV G Gq^'' by virtue of LemmaO Now reapply 
the proofs of this section replacing U hy U and V by V. All the proofs are readily extended except 
for the estimates for the norm V^^, which grow by at most a factor k{T,m) ~ k{M) by virtue of 
Lemma 14.21 □ 

Remark 5.2. How large is the class ofmxn matrices having a numerical rank p? We characterize 
it indirectly, by noting that by virtue of Fact \2.2\ the nearby matrices of rank p form a variety of 
dimension {m + n — p)p, which increases as p increases. 

5.5 Dual additive preconditioning 

For an r7i x n matrix A of full rank we extend (j2.9p and (I2.10p to define the dual additive preprocessing 

A+ =^ C+ = A+ + U^V:^ . (5.13) 

Our analysis implies that the value k{C^) (equal to k(C_)) is expected to have at most order 
a-q^i{A)/<Ti{A) provided I — min{TO,n}, U- G i'' ^- ^ Gq^i'^, and the norm ||A+|| is neither 
large nor small. The randomized algorithm of |D83] is expected to estimate the norm ||A+|| at a 
low computational cost. We can work with the (to + 1) x (n + 1) matrix A — diag(^, e) instead of 
the matrix A and choose a sufficiently small positive scalar e such that ||^^|| = 1/e. Then we can 
scale the matrix A to obtain that ||(^/e)+|| = 1. 

5.6 Can we weaken randomness? 

Would Theorems 11.11 15.11 15.61 and other results of this section and of the next one still hold if we 
weaken randomness of the matrices U and V by allowing them to be sparse and structured, to share 
some or all their entries, or generally to be defined by a smaller number of independent parameters, 
possibly under other probability distributions rather than Gaussian? We have some progress with 
our analytical study in this direction (see Sections 13.41 and 13.51 and Remark 15. ip , but empirically 
all the presented randomized techniques remain as efficient under very weak randomization in the 
above sense (cf. Tables [TU31 MM \WW[ and [TUTB|) . 
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6 Randomized augmentation 

6.1 Augmentation and an extension of the SMW formula 

The solution of a nonsingular linear system of n equations, Ay — b can be readily recovered from 

a null vector ^ j/^^ matrix K — (/3b | A) for a nonzero scalar (3. If the matrix A has 

numerical nullity 1 and if the ratio ||A||/||/3b|| is neither large nor small, then the matrix K is well 
conditioned for the average vector b |PQa[ Section 13.1]. The above map A if is a special case 
of more general augmentation 

W V^^ 



which we study next, beginning with the following extension of the SMW formula. 



Theorem 6.1. Suppose equation i6.1\) holds, m — n and the matrices A, W and K are nonsingular. 
Write S = A -\- UW~^V^ and R = I — V"^ S~^UW^^ . Then the matrix S is nonsingular, is 
the trailing (southwestern) nx n block of K^^, and 

A-^ =S-^ + S-^UW-^R-W^S-\ (6.2) 

Proof. Apply the SMW formula of Theorem El for C replaced by S, U by UW'^ and Ghy R. □ 

6.2 Links to additive preprocessing and condition estimates 

In contrast to the scaled randomized symmetric additive preprocessing A =^ C — A + VV'^ (cf. 
(I5.5P and jW07j ). the map A K = ^ ) cannot decrease the condition number k{A) if 



V A ^ 

if is a symmetric and positive definite matrix; this follows from the Interlacing Property of the 
eigenvalues of K |GL961 Theorem 8.6.3]. Nonetheless the following simple theorem links additive 
preprocessing A =J> C = A + U MV^ to the augmentation A =^ K for K of (|6.ip and later we 
extend Theorem 15.61 to the augmentation as well. 



Theorem 6.2. Suppose A e M™^", W £ W"^"^ , the matrix W is nonsingular, I ~ min{m, n}, a 

matrix K in M(™+'-)x(n+r) defined by fOP, andC ^ A + UW-^V'^ . Then we have 

if = C/diag(C,i,)\>diag(H^,i„) (6.3) 
for U = f^J^'"^ rrw-i)' ^ ^ f^n,r the matrix C has full rank if and only if the 



matrix K has full rank, and so both matrices are rank deficient for r < I. Furthermore U^^ = 
],V-^^ ^ . For m = n and nonsingular matrices C and K , we have 

C-^ = {la I 0„,.)Fdiag(VK,i„)if-iC/(i„ I On,rV and if " ^ = diag( VK" ^ , i„ ) \>- ^ diag(C- ^ , i,) C/^i . 

Corollary 6.1. (Cf. lPQa\ Remark 10.1 and Corollary 11.1].) Define three integers m, n, and I and 
three matrices A, if, and W as in Theorem \6.'A write h = max{m, n}, and suppose that U G Qq^^^ 
and V G 5o,r- Then (i) \\A\\ < \\K\\ < \\A\\ + \\U\\ + \\V\\ + \\W\\, and (ii) the matrix K is rank 
deficient if r < I — rank(A) but has full rank with probability 1 otherwise. (Hi) Furthermore suppose 
r > I ~ rank(A) and the ratio a/\\A\\2 is neither large nor small, e.g., say < cr/||j4||2 < 100. 
Then the matrix if is expected to have condition number of order (1 + 2h)'^ k{W)'^ / ai-r{A) , that is 
(iv) of order (1 + 2haf/ai-r{A) provided W G G^Y . 

Proof. Part (i) is verified immediately. Next estimate the rank of the matrix C as in parts (ii) and (iii) 
of Theorem 11.11 and apply equation (|6.3p to extend the estimates to the rank of the matrix if. This 
proves part (ii). Equation (|6.3p and CoroUarv 14. II together imply that K(if) < h{U)k{V)k{C)k{W). 
(We can apply Corollarv 14.11 because the matrices U, V and W are nonsingular.) We have k{U) < 
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(1 + ||t/||)2 and k{V) < (1 + and we can expect that max{||J7||, ||y||} < 1 + 2ha for 

h = max{TO,n} by virtue of Theorem 13.21 Now apply Corollary 15.21 for M = W^^ to bound k{C) 
and recall that k{W~^) = k{W). Combining these estimates proves part (iii). To extend part (iii) 
to part (iv) note that a matrix W in ^^q^'^ is nonsingular with probability 1 and is expected to be 
well conditioned (see Sections 13.21 and I3.3p . □ 

6.3 Direct condition estimation: Gaussian random leading blocks 

To obtain sharper bounds and better insight into the subject, let us estimate the condition number 
k{K) directly, without reducing this task to additive preprocessing. Some initial study of randomized 
augmentation in this direction can be found in |PQa| . In particular the results of |PQa[ Corollary 
11.1] are similar to Theorem 16.31 but |PQa| only provides a pointer to the idea of a proof. Part (i) 
of Corollary |6T] is extended immediately, and next we extend the other parts. 



Theorem 6.3. Suppose a real normalized mx n matrix A has a rank p < n, U E G™i'^, V E Gqi^, 
W e Goj'', K m M(™+^)x(n+9) defined by JO]) . I = min{m, n}, r = mm{m - q,n- s} > 0. Then 

(i) the matrix K is rank deficient if p <r, 

(ii) otherwise the matrix has full rank V = min{m + s, n + q\ with probability 1 and 

(iii) is expected to have the condition number k{K) of order at most l/(7r{A). 

Thus we can expect that the matrix K has full rank and is well conditioned if p > r. 

Proof. Assume that the entries of the matrices U, V, and W are indeterminates. Then clearly the 
matrix {~U \ A) has full rank, that is has m linearly independent rows, if and only ii p + q > m. 

Likewise the matrix . has full rank, that is has n linearly independent columns, if and only if 



^ A ^ 

p + s > n. The transition from these matrices to the matrix K increases the numbers of linearly 
independent rows by s and columns by q. Summarizing we obtain parts (i) and (ii) provided that 
the entries of the matrices U, V , and W are indeterminates. Relax this assumption by applying 
Lemma 13.11 

Next assume that p + s + q > I' = rank(iir) and estimate the condition number k{K) = 
\\K'\\/ai'{K). By virtue of Theorem 13.21 we can expect that the norms of the matrices U, V, 
and W are in 0(1), that is do not exceed a fixed constant, and so \\K\\ = 0{1) as well because 
ll^i^ll < IIC^II + ll^ll + IIW'II + 1 1^1 1 and \\A\\ = 1. It remains to estimate the value ai'{K) from 
below. We can assume that I' = m + s < n + q, and so r = m — q, for otherwise we can estimate 
k{K^) = k(K). 

At first let s = 0. Then V = rank{K) ^m<n + q, K = {-U \ A), and V and W are empty 
matrices. Reuse and extend the idea of Section [Ol that is reduce the original task to the case of an 
mx m submatrix K of the matrix if, which is nonsingular with probability 1 and for which we have 
am{K) = am{K)] then estimate the value am{K) as the reciprocal 1/||_R'~^||. Namely assume the 
SVD A = Sa^aTJ of ^ and write K' = S^K diag{Ig,TA) = [U' \ S^). Note that Sa, Ta and 
diag(TA,/g) are square orthogonal matrices and infer that aii{K) = <Jm{K') by virtue of Lemma 
14.11 whereas U' = S'^U € Q^i'^ by virtue of Lemma l473l The mx [n + q) matrix K' has the mx m 

leading submatrix K = J^m-q \ .^^j^^j.^ jj^ ^ g^™ 9)x?^ jj^ ^ ^?o^^' rank(A') = rank(if) = m, 

Y^ra-q = diag(crj (A) )™;^'^ , and so rank(Em_g) ^m - q and GviK) = CTm(if) > a„i(i?) = l/||A'"i||. 
We have 

Therefore \\K-^\\ < W^^n-qWi^ + WUoWWi^W where \\Y~^__g\\ = l/arn-q{A). Theorems O and 1321 
together bound the norms ||J7r^|| and ||?7o||, implying that the value l/ai'{K) — is expected 

to have at most order l/am-q{A). 
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Now let s > 0. Then again we reduce our task to the case of a square matrix K ^M.^ (where 
I' — m + s) such that <Jj{K) > (Jj{K) for all j and then estimate the value ai'{K) as the reciprocal 

l/||-fi'^^||. Namely represent the matrix K as ( p ) where B = {W \ V'^), F ~ {—U \ A), and 



the value ||i^^|| has at most order l/am-q{A), as we proved above. Let F = Sf^fT^ be SVD 

and write K" = dmg{L, SJ)KTf = n ) ^^ere Bo G ^0'^', G g„«x/"+«-™), 

Yjf = diag((Tj(F))"^]^, rank(_ftr") = rank(_ft') — V , and so the matrix Yjf is nonsingular and — 
||F+||. We have aj{K") = o-j{K) for all j because the matrices diag(/s, 5*1^) and are square and 
orthogonal. 

Delete the last n + q — m — s columns of the matrix K" and obtain the I' x I' submatrix 
K = [ 'r^" ^ ) . We have <tii{K) = aii{K") > ai'{K). The Gaussian random s x s matrix 



^^F Orn^n+q—va^ 

Bi is nonsingular with probability 1. We assume that it is nonsingular, and then so is the matrix 
K as well, and consequently <tii{K) = ai'{K) = l/\\K^^\\. Observe that 



Therefore WK-^l < HBj^^lKl + ||Bo||)||S^^||. Theorems O and O together bound the norms 
IIB^^II and ||So||- To complete the proof of the theorem recall that — and that the 

norm Hi^"*"!! is expected to have at most order l/(T„i-q{A). □ 

Compared to Corollarv 16.11 Theorem 16.31 allows rectangular matrices W in Q'^^i. Combined 
with Theorem 16.21 it implies Corollarv 15.21 restricted to the case where S Gq^'^ ■ In the next 

subsection we extend Theorem 16.31 by relaxing this restriction provided that s — q and allowing any 
well conditioned block W with the norm not exceeding 1. 

6.4 Direct condition estimates: well conditioned leading blocks 



Next we outline a direct proof of Corollary 16.11 allowing any scaled well conditioned square leading 
blocks W. The supporting estimates are stronger than the ones deduced via combining Corollary [521 
and Theorem 16. 2 1 We begin with providing all details in the case where m — n and r = I — rank(A). 
In this case we allow ill conditioned blocks W . 

Theorem 6.4. Suppose n and r are two positive integers, a real normalized n x n matrix A has a 
rank p < n, U,V £ Gl}^'' , W € ^''^^ < 1, and K denotes the matrix of {Sjp. Then (i) the 

matrix K is singular or ill conditioned if r < n — p. Otherwise it is nonsingular with probability 1. 
Furthermore (ii) if r ~ n ~ p, then the condition number k(K) is expected to have at most order 
\\A\\/cj,,MA). 

Proof. We proceed similarly to the proof of Theorem 16.31 Suppose A = Sa'^aTJ is the SVD of 
and write K = diag(/r, S'J)X diag(/,., Ta). Then 



K 



W 

-U 



A 



where o-j{K) = crj{K) for all j and U,V e Gq^"" ■ Furthermore write Ep = diag(crj (yl))^^^, S 

diag(Ep,0„_p,„_p), U = (^""^ and V = where Uo,Vo G G(;^i" and Ui,Vi G ^Sr"^""' and 
obtain 

fw V,^ \ 

= f/o Sp Op,„_p . (6.4) 

yt/l Op,n— p ^ix—p^n—p j 
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Now we can readily verify the claims about rank(ii'). It remains to estimate the condition number 
k{K) — \\K\\ \\K^-^\\ = \\K\\ \\K~^\\ provided that r — n~ p and the matrix K is nonsingular. 

To bound the norm \\K\\, note that K = ^''A + f j^^'"' \ '^%A , recall that 

\\W\\ < \\A\\ = 1, apply bound (HJ]) and obtain 

||X||<l + max{||;7||,||F||}. 

By virtue of randomized bounds of Theorem l3.2l we expect to have the norms \ \U\ \ and \ \V\ \ in 0(1), 
that is bounded by a constant, and then \\K\ \ is in 0(1) as well. 

We conclude the proof by estimating the norm ||_^|| = \ \K\\. We readily verify that 



Apply bound (|2.7p and deduce that 



where iVi =max{||Ff ^11, ||I];;l,||,||C/ri||}, iVa = max{||V^f ^Fo^E;;iJ|, ||I];^l,t/oC/ri||} and iVg = 
||Ff^(W^ + CSn'rf^o)f/r'||- Recall that \\W\\ < 1, = l/a„_,(A), = \\Vo\\, and 

\\V-^\\ = ^11 and deduce that 

iVi =max{||Ffi||,||;7fi||,l/c7„_,(A)}, 
N2<raa^{\\Vf'\\ ||Fo|U|C/o|| ||C/r^||}/a„_,(A), 

N3<\\Vr'\\ \\Ur'\Kl + \\Vo\\\\U„\\/a,,^r(.A)). 

Apply Theorems O and [32] to estimate the norms \\U\\, \\V\\, \\Vo\\, \\Uo\\, \\U^^\\ and ||Vf ^||. 
Combine all the above bounds to estimate the norm ||_fi'~^||. □ 



Next we extend Theorem 16.41 to the case where r > n — p and where we require that the leading 
block W be normalized, square and well conditioned. 

Theorem 6.5. Theorem \6.4\ still holds where r > n — p and the leading block W of the matrix K is 
normalized, square and well conditioned. 

Proof. Clearly parts (i) and (ii) of Theorem l6.4l are extended, and moreover we immediately deduce 
that for r > n — p the matrix K is nonsingular with probability 1 and that its norm \ \K\ \ is expected 
to be in 0(1). It remains to estimate the norm ||_fi'^-'^||. Assume SVDs W = Sw'^wT^, and 

A = Sa^aTa. write K = diag(S'^,5j)is:diag(riv,r4) and observe that K = {^^ where 

U ,V € Gq\^ by virtue of Lemma [4.31 because the matrices Sw^ TV, Sa, and Ta are square and 
orthogonal. Now complete the proof by extending the techniques used in Section 15.21 in the proof of 
part (iv) of Theorem II. 1[ in the case where r > n ~ p. □ 



Corollary 6.2. Keep the assumptions of Theorem \6.5\ except that now let the matrix A have 
numerical rank p, rather than rank p. Then Theorem \6.5\ is extended and furthermore the matrix K 
is expected to be well conditioned. 

Proof. We immediately extend parts (i) and (ii) of Theorem 16.41 To extend part (iii) truncate the 
SVD of the matrix A by setting to all its singular values except for the p largest ones to obtain 

matrices A — E ^ A of rank p and K = ( ^ ^ ^ j such that \\K — K\\ < \\E\\ = a-p+i(A). 
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The value (Tp+i(yl) is small because the matrix A has numerical rank p, whereas the norm ||_ftr~'^|| 
is not expected to be large by virtue of Theorem 16.51 Therefore we can expect that ||i?iir~^|| < 1/3, 
and consequently that ||ivr~^|| < 1.5||iir~^|| by virtue of Theorem 12.31 Consequently the condition 
number k{K) is also expected to be of at most order ||A||/(Ti_r(A) — l/iT„-r(^) as in part (iii) of 
Theorem 16.51 but in the corollary this means that the matrix is well conditioned because we assume 
that the matrix A has numerical rank p, and so the ratio ||A||/CTp(A) is not large. □ 

The corollary implies that the matrix K is nonsingular with probability 1 and is expected to be 
well conditioned in the case where the matrix A has a numerical rank at least n — r. Let us extend 
our analysis to the case of rectangular matrices A G R™^" . 

Theorem 6.6. Suppose m, n, and r are three positive integers, I — min{m,n} , A G R™^", the 
matrix A has a numerical rank p <l-r, U G G^^'' , V G G^l^^ , W G R''^^ \\W\\ < 1, K is the 
(m + r) X (n + r) matrix defined by equation 116. Then (i) this matrix is rank deficient or ill 
conditioned if p < I — r, but otherwise has full rank with probability 1 and (ii) is expected to be well 
conditioned. 

Proof. Part (i) is readily verified. Let us prove part (ii). Suppose A = Sa^aT'^ is the SVD of 
(12311 and write K' = diag(/^, 5j)Kdiag(/r, Ta) = f ^ where U = -S'JC/ and = V^Ta- 



U IIa, 

Observe that k{K) = k(X'), U G V G ^^ and k{K) = k{K') because 5"^ and Ta are 

square orthogonal matrices. Define the leading I x I submatrix K' ~ /,„+r.n+r-^'-^,T+r m+r Ig.h 

of (|2.ip and observe that K' = ^ , [/ = {U2 \ Om-ri,n), U2 G ^0™ "''^^ if m > / = n, whereas 

K' = {k' I V^), = (F/ I 0„-™,™), V2 G \in>l^m. Clearly ai{K') > ai{k') (cf. 

Fact EH) and ||i^'|| < \\k'\\ + ||i^|| for F ^ U2 or F = V2. In both cases F G and so we 

can expect that \\k'\\ = 0(||-^'||) because ||^'|| > = \\A\\ = 1. CoroUarv lOl implies that 

with probability 1 the {I + r) x (I + r) matrix K' is nonsingular, and then rank(i4r) = rank(if ) = 
rank(_ft'') — l + r^ implying that the matrix K has full rank. Furthermore Corollarv 16.21 implies that 
the matrix K' is expected to be well conditioned. It remains to extend this property to the matrix 
K. Recall that k{K') = \\k'\\/ai{k') and niK) = k{K') = \\K'\\/ai{K') and combine the above 
equations with the bounds cri{k') > cri{K') and \\k'\\ = 0{\\K'\\), deduced earlier. □ 



6.5 A randomized Toeplitz solver 

Let us apply Theorem 12.61 to support randomized augmentation for solving a nonsingular Toeplitz 
linear system Ty = b of n equations provided the matrix T has numerical nullity 1. 

To compute the vector y = r~^b, we first embed the matrix T into a Toeplitz (n + 1) x (n + 1) 

matrix K = i ^ rp j. We write w — ejTei and fill the vectors f — and v = with 

appropriate entries of the matrix T except for the two coordinates /„ and w„, which we choose at 
random and then scale to have the ratios and neither large nor small. 

Part (b) of Theorem [2]6] expresses the inverse T~^ via the vectors v — K^^ei and w = if~^e„+i. 

In view of Section 13.21 and Appendix A, this policy is likely to produce a nonsingular matrix K 
whose inverse is likely to have a nonzero entry efK~^ei. In good accordance with these formal 
results our tests have always produced nonsingular and well conditioned matrices K such that 
ej;K~^ei ^ 0. 

To summarize, we reduce the solution of a nonsingular ill conditioned Toeplitz linear system 
Ty = b to computing highly accurate solutions of two linear systems Kyi = ei and /Cz = e„_|_i, 
both expected to be well conditioned. High accuracy shall counter the magnification of the input 
and rounding errors, expected in the case of ill conditioned input. 
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In the important special case where a Toephtz matrix T is real symmetric, we choose real scalars 
w and /„ = Vn to yield a real symmetric matrix K = { ^ j . In this case J„_|_i/-C^^ J,j+i = K^^ ^ 

and so K^^Gn+i = Jn+iK^^ei because Jn+i^n+i = ei. Thus we only need to solve a single linear 
system with the matrix K. For the transition back to the solution of the original problem, we 
can employ expression (j6.2p or Theorem 12.61 Hereafter we refer to the resulting algorithm for the 
linear system Ty = b as Algorithm 6.1. In Section [10.61 we test this algorithm for solving an ill 
conditioned real symmetric Toeplitz linear system. 

One can readily extend the approach of this section to the case of Toeplitz-like, Hankel and 
Hankel-like inputs and to augmenting the input matrix with r rows and r columns for r > 1. 



7 Low-rank approximation, approximation of singular spaces, 
and computation of numerical rank 

7.1 Randomized low-rank approximation: an outline and an extension to 
approximation by structured matrices 

Our next theorem expresses a rank-p approximation to a matrix A through an approximate matrix 
basis for the left or right leading singular space Tp.^i or Sp,^- We can obtain such basis by computing 
the SVD of the matrix A or its rank- revealing factorization jGE96j , [HP 92] , |P00a| , but if the matrix 
A has a small numerical rank p and if we are given its reasonably small upper bound p+ , then with 
a probability near 1 we can compute such basis at a low cost from the product A^G for G G G^i''^ ■ 
Theorem 17.21 of Section 17.31 formally supports correctness of the respective randomized algorithm, 
but our tests support it consistently even where G G T^^^^^ (see Tables [inS] and [iniOl) , and 
we conjecture that the same is true for various other classes of sparse and structured matrices G 
defined by fewer random parameters. We specify a low-rank approximation algorithm in Section [7.41 
its amendments in Section 17.71 and some related randomized algorithms of independent interest for 
the approximation of leading and trailing singular spaces of an ill conditioned matrix in Sections 17.51 
and l7.6l By applying low-rank approximation algorithms to a displacement of a matrix W having a 
possibly unknown numerical displacement rank d, that is lying near some matrices with displacement 
rank rf, we can approximate the matrix W by one of these matrices and output d as by-product. In 
Section [8.31 we apply this observation to Newton's structured matrix inversion. 



7.2 The first basic theorem: low-rank approximation via the basis of a 
leading singular space 

The following theorem expresses a rank-q approximation (within an error norm fig+i {A)) to a matrix 
A through a matrix basis of its leading singular space T^.^ or §g.^ . 

Theorem 7.1. Suppose A is an my. n matrix, Sa^aT^ is its SVD of 112.5]} . q is a positive integer, 
q < min{m, n}, and T and S are matrix bases for the spaces Tq_A o,nd §q,A, respectively. Then 

\\A~AT{T^T)-'T^\\ = \\A~ S{S^S)-^S^A\\^aq+i{A). (7.1) 

For orthogonal matrices T and S we have T^T = S^S = Iq and 

\\A-ATT'^\\ = WA-SS'^AW =aq+iiA). (7.2) 

Proof. Let us first write P = Tq^AT'^A r ^ n — q and estimate the norm \\A — AP\\. We have 

AP = SA^ATTTq aT^a- Substitute TTTq a = f/'' ^ and obtain AP = Sa^a (^'A , whereas 



AJ^q,A-Lq A- ^UUSllLULe l^lq^A = yQ J OUiam = OA^A 

Sa^a [ ■ Therefore 

A-AP^ Sa^a (^^I") = Sa diag(0„ diag(a,)^^,+i) (^^r ) , 
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and so ^P|| — || diag((Tj)"^^^]^|| = fg+i because and T^i^^ are orthogonal matrices. Similarly 
deduce that \\A - Sq^AS^j^AW = aq+i{A). This proves ((7?T|) and ((7?^ for T ^ T^^a and S = Sq^A- 

Now let the matrices T and 5" have full rank, 7^(^) = Tq,A = T^{Tq,A), 7^(S') = Sq,jt = 
T^{Sq.A), and so T = Tq^AU and S" = ^^.^^y for two nonsingular matrices U and Conse- 
quently T{T^T)-^T^ = Tq,AU{U^T^^Tq,AU)-^U^T^^. Substitute T^A^q^A = Iq and deduce that 
{U'^T^/Tq^AU)-^ = (U^U)-' = U-'U-^. Therefore UiV^T^ ^Tq^AUy'^U^ = UU-'U-^U^ = 
Iq, and so T{T^T)-^T^ - Tq^AUiU^Tl/Tq^AUr^U^Tl^ = Similarly SiS^Sr^S"^ = 

implying the desired extension. □ 

7.3 The second basic theorem: a basis of a leading singular space via 
randomized products 

The following theorem supports randomized approximation of matrix bases for the leading singular 
spaces Tp,A and 'Sp^A of a matrix A having numerical rank p. 

Theorem 7.2. Suppose a matrix A e has a numerical rank p, H <E Gq^''^ and G G Q^^i'^* 

for p+ > p. Then the matrices T — A'^G and S = AH have full rank with probability 1 and we can 
expect that they have numerical rank p and that 

5 + A = Sp^aU and T + A' = Tp^AV (7.3) 

for two matrices A and A' having norms of order (Tp-|_i(A) and for two nonsingular matrices U and 
V having condition numbers of at most order \\A\\/ {ijp{A)^fp). 

Proof. The techniques of Section r3.2l and Theorem 14 . II support the claims about ranks and numerical 
ranks. It remains to deduce the former probabilistic relationship Sp.^/z+A — ^p.A of (|7.3p because 
we can apply it to A"^ to obtain the latter relationship p^A'^G+i:^' = ^p.A- 

Assume the SVD A ~ 5aS^TJ and note that \\^a — diag(Sp,^, 0„i_p,„_p)|| < CTp+i(A). Con- 
sequently \\A- Sa diag(Ep,A,0,m_p,„_p) TJ|| < ap+i{A) and AH ^ S - A, S = Sp^AU, ||A|| < 
(Tp+i(A) where U — Sp.^ii?, B = Tj^^H, and we can expect that the norm ||iJ|| is bounded 

from above and below by two positive constants (see Theorem 13.21) . This implies (17. 3p . It remains 
to estimate k{U). 

With probability 1 the p x p matrices B and U are nonsingular (see Section |321) ■ Furthermore 
we have \\U\\ < \\^p,a\\ \\B\\ where ||Sp,A|| = \\A\\ and \\B\\ < \\Tp^A\\ \\H\\ = \\H\\. So \\U\\ < 
\\A\\ \\H\\ = 0(||A||).' We also have \\U+\\ < W^'Wl \\B-^\\ for nonsingular matrix B. Observe that 

||E~^|| = l/ap(A), apply Theorem 14.11 where M = Tjj^, r = p and ar{M){M) — a — 1 and obtain 
that the norm ||_B~^|| is expected to have at most order 1/^/p. Summarizing we can expect that the 
norm ||C/"'"|| has at most order l/{ap{A)y/p). Consequently k{U) = ||J7|| ||f/"''|| has at most order 
\\A\\/iap{A)^). □ 



7.4 A prototype algorithm for low-rank approximation 

Theorems 17.11 and 17.21 imply correctness of the following prototype algorithm (cf. [HMTlll Section 
10.3]), where the input matrix has an unknown numerical rank and we know its upper bound. 

Proto- Algorithm 7.1. Rank-p approximation of a matrix. 

Input: A matrix A G jjmx" having an unknown numerical rank p, an integer p^ > p, and two 
tolerances r and r' of order CTp+i {A)/\ \A\\. (We can choose t at Stage 2 based on rank revealing 
factorization, can choose t' at Stage 3 based on the required output accuracy, and can adjust 
both tolerances if the algorithm fails to produce a satisfactory output.) 

Output: FAHURE (with a low probability) or an integer p and two matrices T G R"^'' and 
Ap G R™^", both having ranks at most p and such that \\Ap — ^|| < t'||A|| and T satisfies 
{73) /or ||A'||<r||A||. 
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Computations: 



1. Compute the n x matrix T' — A^G for G G Gq^i'^^ ■ 

2. Compute a rank revealing factorization of the matrix T' and choose the minimal integer 
s and an n X s matrix T such that ||T' — (T \ On,p_f_-s)\ \ < '''1 1^1 1- 

3. Compute the matrix Ag = AT{T'^T)~^T"^ . Output p = s, T and Ap and stop if \\Ap — 
^11 < t'||(A)||. Otherwise output FAILURE and stop. 

Assume that both tolerances r and r' have been chosen properly. Then by virtue of Theorem 1 7. 2 1 
we can expect that at Stage 2 s ^ p and T is an approximate matrix basis for the singular space Tp_^ 
(within an error norm of at most order (7p+i(A)). Consequently Stage 3 outputs FAILURE with a 
probability near 0, by virtue of Theorems 1 7. II (In the case of FAILURE we can reapply the algorithm 
for new values of random parameters or for the adjusted tolerance values r and r'.) At Stage 2 we 
have s < p because nrank(A'^G') < nrank(A) = p, whereas the bound \\Ap — A\\ < T'||(yl)|| at Stage 
3 implies that s > nrank(A). This certifies the outputs p, T, and Ap of the algorithm. 

We can similarly approximate the matrix A by a rank-p matrix S{S'^ S)~^ S'^ A, by first computing 
the matrix S' = AH for H £ Gq^''^ , then computing its rank revealing factorization, which is 
expected to define an approximate matrix basis S for the space Sp,A, and finally applying Theorem 
17.11 We have T^T = J„ and S'^S = Im where the matrices T and S are orthogonal, and then the 
expressions for rank-p approximation are simplified. 

Remark 7.1. One can weaken reliability of the output to simplify Stage 3 by testing whether 
\\K^iA - Ap)L\\ < t\\K\\ \\A\\ \\L\\ for matrices K € G^^i''' and L G Gq^' and for two small 
positive integers p' and p" , possibly for p' = p" = 1, instead of testing whether \\Ap — A\\ < t'||(j4)||. 
One can similarly simplify Stage 2. 

Remark 7.2. For p^ = p Stage 2 can be omitted because the matrix A^G is expected to be a 
desired approximate matrix basis by virtue of Theorem \7.2\ The increase of the dimension p^ beyond 
p (called oversampling in \HMTllf ) is relatively inexpensive if the bound p^ is small. ^HMTl 
suggests using small oversampling even if the numerical rank p is known, because we have 

Probability {\\A- ATT^\ \ < (1 + 9^ p+ min{m, n})ap+i{A)} > 1 - 3{p+ - p)P-P+ for p+ > p. 

Theorem \7.2\ however, bounds the norm \\A — ATT'^W strongly also for p = p^, in good accordance 
with the data of Tables \10.9\ and MO.HX 

7.5 Computation of nmbs and approximation of trailing singular spaces 

One can approximate trailing singular spaces S^.p a-nd p of a nonsingular ill conditioned matrix A 
having numerical rank p by applying Proto-Algorithm l7.1l to the matrix A~^ , because Tp^-i = §A,p 
and Sp.^-i — ^A,p- Next we achieve the same goal without inverting the matrix A, furthermore we 
cover the case of rectangular inputs. At first we compute a nmb of a rank deficient matrix A and 
then approximate the trailing singular space fA,r of an ill conditioned matrix A by truncating its 
SVD and applying Theorem 12.21 or 12.31 We can compute left nmbs and approximate left trailing 
singular spaces by applying the same algorithms to the matrix . 

Theorem 7.3. JPQIO, Theorem 3.1 and Corollary 3.1]. Suppose a matrix A G K™^" has rank 
p, U e M'"^^ V G M"'''', and the matrix C = A + UV^ has full rank n. Write B = C^^'^U . 
Then r > n — p, TZ{B) I) J^{A); moreover if r = n — p, then G'^^^U = nnib(A). Furthermore 
n{BX) = N{A) ifn{X) = N{AB). (Note that AB = Uilr-V^G-^U) for m ^ n.) 

Theorem 7.4. Suppose that A G M"^", U G M™^«, V G M"''^ W G IR"''^ K = 

iSink{W) ^q> mA{A), rank(X) = n + q, m > n. Write Y (0„,g | In)K^^^ (^u'^^ ' 
(a) Af{A) C n{Y) and i/rank(C/) = nu\{A), thenN{A) = n{Y), 



W 

-[/ A 
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(b) n{YZ) =Af{A) ifn{Z) =Af{AY), whereas 

(c) n{Z) =N{AY) ifn{YZ) =7V(A) and i/rank(y) = q. 

Proof. See [PQal Theorems 11.2 and 11.3]. □ 

Remark 7.3. Both theorems define aggregation processes (cf. JMPSOf ). For r > n — p, Theorem 
\7.S\ reduces the computation of a nmb(yl) to the same task for the input BX of a smaller size 
n X {r — n + p). Furthermore, suppose that the matrices U and Y have full rank q. Then part (a) of 
Theorem \ 7.4\ implies that Y is a nnib(j4) if q — mil{A), but otherwise parts (b) and (c) reduce the 
original task of computing a nmb{A) to the case of the input AY of a smaller size m x (q — mil{A)). 

Theorem 7.5. Assume that U E R™^''+, V € R"^''+, m > n, a real mxn matrix A has numerical 
rank p = n — r, and the matrix C = A + UV'^ has full rank and is well conditioned. Then p > n~ r+ 
and there is a scalar c independent of A, U, V, m, n and p such that \\C^UX—TA.r\\ < ccrp+i(A)||[/|| 
w/iere X e M'■+^^ X = nmb(ylC+C/ + A), ||A|| < cap+i{A)\\U\\. 

Proof. The theorem turns into Theorem 17.31 ii p — nrank(yl) = rank(^). li p = nrank(74) < 
rank(y4.), set to zero all but the p largest singular values in the SVD of the matrix A. Then p = 
nrank(yl — E) = rank(A — E) and the theorem holds for the resulting matrix A — E and the matrix 
C - E = A - E + UV'^. Therefore TA-E,r = {C - E)+UX where for X we choose an orthogonal 
nmb((A - E)){{C - E)+U), of size r+ x r. Clearly \\TA-E;r - TaMW = 0(crp+i(A)) for some rxr 
orthogonal matrix Q, and it remains to estimate the norm ||(C — E)^UX — C~^UX\\. We have 
\\{{C ~ E)+ ~C+)UX\\ < \\{C - E)+ ~C+\\ \\U\\. The norm ||£;|| ^ ap+i{A) is smah because the 
matrix A has numerical rank p, whereas the norm ||C — is not large because the full rank 

matrix C is well conditioned. Therefore the value r = ||C — £')+|| — ||C+|| has at most order (Tp+i(A) 
by virtue of Theorem 12.21 □ 



Corollary 7.1. Suppose a normalized realnixn matrix A has numerical rank p — n—r, U G Gq^i^*, 

V e Gq^^^ , m > n, and C ~ A + UV"^ . Then (i) the matrix C is singular or ill conditioned if 
r^ < r but otherwise (ii) has full rank with probability \, and (Hi) we can expect that the matrix 
C^UX is an approximate matrix basis for the singular space Ta^t within an error norm of at most 
order ap-^-l{A) where X is an orthogonal nmh^AC'^U + A) of the size r_|_ x r and ||A|| < cap+i{A). 

Proof. Part (i) is immediately verified. Furthermore by virtue of Theorem 15.61 the matrix C has full 
rank with probability 1 and is expected to be well conditioned, whereas the norm \\U\\ is expected 
to be not large by virtue of Theorem 13.21 Therefore Corollary 17.11 follows from Theorem 17.51 □ 



Likewise by employing Theorems 16.31 and 17.41 instead of Theorems 15.61 and 17.31 we obtain the 
following result. 

Corollary 7.2. Suppose that a normalized real mxn matrix A has numerical nullity r = nnul(A), 
U e GZ"\ V e G^.r, W e G^^', ^ = ^^). r^nk{W) - q, rank(i^) = n + q, Y ^ 

{On.q I In)K^ i^l/'^^ ' ^^'^ m > n. Then (i) the matrix K is rank deficient or ill conditioned 

where q < r but otherwise has full rank with probability 1 and is expected to be well conditioned. 
Furthermore we can expect that within an error norm of at most order cr„_g_|_i(A) a matrix basis for 
the singular space TA,q is approximated by (ii) the matrix Y if r = q or (Hi) the matrix YZ if q > r 
where Z e W""'' , Z ^ iimh{ AY + A) , ||A|| < ccr„_<,+i(A). 

Corollaries l7.1l and l7.2l (for s = q) imply correctness of the two following Prototype Algorithms. 

Proto- Algorithm 7.2. An approximate basis for a trailing singular space by using ran- 
domized additive preprocessing. 

Input: A matrix A e JJ™^" fgj- jjl > n with \ \A\\ w 1, an upper bound r_|_ on its unknown numerical 
nullity r — nnul(yl), and two tolerances r and r' of order crn-r+i{A). (The tolerances are 
defined by the requested output accuracy. In a variation of the algorithm one can reapply it 
with a decreased tolerance r' instead of outputing FAILURE at Stage 4-) 
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Output: FAILURE (with a low probability) or the numerical nullity r and an approximate matrix 
basis B, within an error norm in 0{crn-r+i{A)), of the trailing singular space TA,r- 

Initialization; Generate two matrices U G Gq^i^^ and V G Gq^^^ for a of order \\A\\. 
Computations; 

1. Compute the matrix C = A + UV^ . 

2. Stop and output FAILURE if this matrix is rank deficient or ill conditioned. Otherwise 
compute the matrices Y = C'^U and AY . 

3. Output r = r+ and B ^ Y and stop if \\AY\ \ < t\\A\\ \\Y\\. 

4-. Otherwise apply an algorithm (e.g. employing SVD, rank revealing factorization, a tech- 
nigue from \PQ10^ or JPQBj, or one of Proto- Algorithms \ 7. S\ and \ 7. 3^ that for the matrix 
AY and a fixed tolerance r' computes an integer r and an orthogonal approximate matrix 
basis X (of size r_|_ x r) for the space Tj^y.r- If ^ '''W^W then output r and 

B = YX and stop. Otherwise output FAILURE and stop. 

Proto- Algorithm 7.3. An approximate basis for a trailing singular space by using ran- 
domized augmentation. 

Input, Output and Stages 3 and 4 o/ Computations are as in Proto- Algorithm[TM 



Initialization; Generate three matrices U e Gq i , V e Gq i ^ , o-nd W G Gq\ ^ for a of 
order 



Computations; 

( W 

1. Stop and output FAILURE if the matrix K = [ ^ A ] rank deficient or ill condi- 
tioned. 

2. Otherwise compute the matrices Y = (On,r+ | In )K+ {^"^jj"^^ and AY. 

7.6 Alternative methods for the approximation of leading singular spaces 

Next we extend Theorem 17.51 and Corollary 17.11 assuming a nonsingular input matrix and an upper 
bound on the numerical rank of its inverse. 



Theorem 7.6. (Gf. Remark\T^) Assume that five matrices A G R"><", U-,V- G R"^«+, H = 
Ig^ + V-AUT , and C_ = A — AU-H'^^V^ A have full ranks, the matrix C- is well conditioned, 
and q = nrank(A) =< q^. Then there exists a scalar c_ independent of A, U^, V^, n and q^ and 
such that \\C'EV-Y^ — Tq^A\\ < c_(Tq+i(A) where y_ G ]R'^+^'', y_ is a matrix basis for the space 
^A-WTv+A,q and ||A|| < C_cr,+i(^). 

Proof Recall that CZ^ = A-^ -^U-Vl' (cf. ((2l0| ') and that nrank(A) = nnul(A-i). Rewrite SVDs 
A = Sa^aTJ and A-^ ^ TaT^Z^S^ as follows, 



T 



A= {Sq^A I SA,n-q) diag{Y,q^A,^A,n~q){Tq^A \ TA,n-q) 

A-^ = {Tq^A I r^,„-,)diag(I]^-^,E+„_^)(5,,A I SA,n-qf. 
Apply Theorem 17.51 to the matrices A^^ , , and X replacing A, C, and Y, respectively. □ 

Remark 7.4. One can first compute the numerical nullity q — nnul(A) of an ill conditioned matrix 
A (see Section \8.2\ on this computation) and then an approximate matrix basis CTV of the space 
"^q^A- This can be more attractive than computing the matrix A^^CTV. In the next two corollaries 
we assume that the numerical rank of the input matrix A is available. 
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Corollary 7.3. (Cf. Remark IT^) Suppose A £ R"^", U-,V- E G^^'' , a has order \\A-^\\, 
H = Ig+ V^AU^, C_= A- AU_H-^VlA for H = Ig + VlAU_ (cf. ^KW), and q = nrank(A). 
(See Section \5.5\ on estimating the norm \\A~^\\.) Then the matrix C^V- is expected to approximate 
within an error norm of at most order aq^i{A^^) a matrix basis of the leading singular space Tq^A- 

Proof. By virtue of Theoreni l5.6l the matrix C_ of Theoreni l7.6l has full rank with probability 1 and 
is expected to be well conditioned, and so Corollary 17.31 follows from Theorem 17.61 for q ^ q^. □ 



The dual augmentation of the following corollary provides an alternative expression for an ap- 
proximate matrix basis of a leading singular space. 

Corollary 7.4. Suppose that A E R"^", U,V £ ^qct*' ^ ^ Sq^J' , the matrix A is nonsingular, a 
has order ||^^^||, q — nrank(j4), = ^ J^^^ ' ^^^l^(^) ~ and rank(_R'-|_) = n + q. Then 

we can expect that the matrix T+ — (On.q \ In)K^^ i^u'^^ approximates within an error norm in 
0{<7qj^i{A)) a matrix basis for the right leading singular space Tq,A- 

Proof Write SVD A = Sa^aTJ of ^ and deduce that A-'^ = SAi^Ay^Tj where (Sa)"^ = 



diag(l/crj(A))"^j. Note that nrank(A) = nnul(A ^) and apply Corollar v 17.21 replacing the matrix 
A with A^"^. □ 

Closer examination of the expression for the matrix T^^ enables us to simplify it as follows, 

T+=B- BS^W^B for B = A^U (7.4) 
where S = W + U^A^V. S^^ is the only matrix inverse involved into computing r+ (cf. ()2.10p '). 



7.7 Some amendments 

Remark 7.5. Approximation of the leading and trailing singular spaces as well as the computation 
of numerical rank and numerical nullity (see Section \8.2\) are facilitated as the gaps increase between 
the singular values of the input matrix A. This motivates using the power transforms A =^ B^ = 
{AA'^)^A for positive integers h because (Tj{Bfi) = {a'j{A))'^'^^^ for all j . 

Remark 7.6. In the case where m — n the computations are simplified and stabilized, and fur- 
thermore we can apply Theorem \ 7.3\ or \7.4\ to both A and to define both left and right nmbs. 
We can reduce to this case the computation for a rectangular matrix A in various ways, e.g., by 
observing that (a) N{A) = N{A^A), (b) N{A) = N{B'^A) if A,B e E™^" and the matrix B 
has full rank m < n, and (c) {A \ 0„..m_„)u = Om and only if Au = provided m > n and 
u = (/„ I 0„ „i_„)u, whereas {A'^ \ 0„,m-ri)v = 0„ if and only i/ v = 0^ provided m < n and 
V = {Im I 0„_m.m)v. Furthermore given an m x n matrix A for m > n, we can represent it as the 
block vector A — {Bf \ B^ | . • . | B^)"^ where Bi are hi x n blocks for i ~ 1, . . . , h, X]i=i ~ ™7 
and observe thatJ\f(A) = n^^]^A/'(i?i), and we can compute the intersection of null spaces by applying 
\GL96[ Theorem 12.4.1]- One can extend these comments to the tasks of the approximation of the 
singular spaces of ill conditioned matrices. 

8 Sparse and structured randomization. Numerical rank 
without pivoting and orthogonalization 

8.1 Randomized structured preprocessing 

Would the additive preprocessing A C — A + UV"'" preserve the structure of an n x n matrix 
A where A € R™^", f/ e M'"^'' and F € Adding the matrix UV'^ makes smaU impact on 

the structure if the ratio r/ min{m,n} is small, e.g., the displacement rank increases by 0{r) (cf. 
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[POlp . but we can control this impact even for large values r by endowing the matrices U and V 
with proper structure. Given a pair of standard Gaussian random Toeplitz n x r matrices U and V 
and a displacement generator of a small length d for a nonsingular ill conditioned Toeplitz-like nxn 
matrix A that has a numerical nullity r = nnul(A) and a norm ||^|| w 1, we can readily compute a 
displacement generator of length d + 0{l) for the matrix C = A+UV'^ . By exploiting the structure 
we can operate with this matrix in nearly linear arithmetic time, e.g., solve a nonsingular linear 
system Ay = b in 0(c?^nlog^n) flops, even where r is large (see Theorem 12. 9p . Both randomized 



H preserve matrix sparseness and structure even better. Empirically these maps preserve their 
preconditioning properties for such choices of the matrices G, H, U, V, and W; likewise endowing 
random multipliers with sparseness and structure keeps the support for safe numerical GENP and 
block Gaussian elimination (see Remark H^] and Tables [TU31 \WM [TUTTUl and ll0.16p . 

Remark 8.1. Alternative deterministic techniques of homotopy continuation also support inversion 
in nearly linear time of nonsingular Toeplitz matrices and other matrices with displacement structure 
(see JEM Section 6.9], JFOT^ , JEM)- 

8.2 Numerical rank without pivoting and orthogonalization 

If we know the numerical rank p of a matrix A, then we can simplify Proto- Algorithm !?. II for rank-p 
approximation as well as the computation of approximate bases for the leading and trailing singular 
spaces of the matrix A (see Remarks 17.21 and 17.41 and Corollaries 17. II and 17. 2p . 

The customary algorithms for the numerical rank of a matrix rely on computing its SVD or 
rank revealing factorization, which involve pivoting and orthogonalization and thus destroy matrix 
sparseness and structure. Randomized Proto- Algorithm !?. H is a noncostly alternative where the given 
upper bound p+ on the numerical rank is small. Indeed Proto- Algorithm 17.1! uses rank revealing 
factorization at Stage 2 and matrix inversion or orthogonalization at Stage 3, but in these cases only 
deals with matrices of small sizes if is small. 

Next we describe other alternatives that avoid pivoting and orthogonalization even where the 
numerical rank p is large. As by-product they compute an approximate matrix basis within an 
error norm in 0{<7p^i{A)) for the leading singular space Tp^A of an m x n matrix A and if we wish 
also rank-p approximation of the matrix A (see Remark l8.2p . We let to > n (else shift to A'^), let 
= [0,7i] unless wc know a more narrow range, and successively test the selected candidate 
integers in the range until we find the numerical rank p. To improve reliability, we can 

repeat the tests for distinct values of random parameters. 

Exhaustive search defines and verifies the numerical rank p with probability near 1, but with 
proper policies one can use fewer and simpler tests because for G E Gq^i'^ (and empirically for 
various random sparse and structured matrices G as well) the matrix B = A^G is expected (a) to 
have full rank and to be well conditioned if and only if s > p, (b) to approximate a matrix basis 
(within an error norm in 0{ap+i{A))) for a linear space T D Tp.s = Tp^A where s > p, and (c) to 
approximate a matrix basis (within an error norm in 0((Tp+i(A))) for the space Tp^A where s — p. 
Property (a) is implied by Theorem 14. 11 properties (b) and (c) by Theorem 17.21 

Proto- Algorithm 8.1. Numerical rank with random sampling (see Remarks 1 8. 2\ - \8. 4 j ) ■ 

Input: Two integers p_ and p+ and a matrix A G jj'nx" having unknown numerical rank p = 
rank(A) in the range [p_,p-|-] such that < P- < p+ < n < m, a rule for the selection of a 
candidate integer p in a range [p_,p+], and a Subroutine COND that determines whether a 
given matrix has full rank and is well conditioned or not. 

Output: an integer p expected to equal numerical rank of the matrix A and a matrix B expected to 
approximate (within an error norm in 0{ap-i-i{A))) a matrix basis of the singular space '^p,A- 
(Both expectations can actually fail, but with a low probability, see Remark \8.S[ ) 




proprer choice of the random blocks U, V, and W 



AH for proper choice of random matrices G and 
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Initialization: Generate matrix G € Gl^^i''^ and write B ^ A, Gp = G{Ip \ Op^m-pY for P = 
P-,P- + 1, • ■ • 

Computations; 

1. Output p — and the matrix B and stop if p- = p+- Otherwise fix an integer p in the 
range [p_ ,p+\. 

2. Compute the matrix B' — B^Gp and apply to it the Subroutine COND. 

3. If this matrix has full rank and is well conditioned, write p^ = p and B = B' and go to 
Stage 1. Otherwise write p- ~ p and go to Stage 1. 

Remark 8.2. The algorithm can output a wrong value of the numerical rank, although by virtue of 
Theorems \ 7. 1\ and \ 7. 2\ combined this occurs with a low probability. One can decrease this probability by 
reapplying the algorithm to the same inputs and choosing distinct random parameters. Furthermore 
one can fix a tolerance t of order ap+i{A), set T = B, and apply Stage 3 of Proto- Algorithm \7.1\ 
Then nrank(A) is expected to exceed the computed value p if this stage outputs FAILURE and to 
equal p otherwise, in which case the algorithm also outputs a rank-p approximation of the matrix A 
(within an error norm t\\A\\ in 0{apj^i[A))). For a sufficiently small tolerance r the latter outcome 
implies that certainly p > nrank(A). 

Remark 8.3. A Subroutine COND, which tests whether an m x p matrix B' has full rank and is 
well conditioned, can employ SVD of the matrix A or its rank revealing factorization, thus involving 
pivoting or orthogonalization. We can avoid this charge on matrix sparseness and structure by using 
randomization (although this is less important where p+ is small). Namely assume that, say m > n 
and recall that the algorithm of \D83f computes a close upper bound cr^ on the largest eigenvalue 
of the matrix S — A by recursively computing the vectors Vj = S'^v = S'Vi_i, i = 1,2,... for a 
random vector v = vq. By reapplying this algorithm to the matrix cr^I — A we can approximate 
the absolutely smallest eigenvalue of the matrix S, which is actually equal to a'^{A). Here we just 
need a crude estimate to support our algorithm. 

Remark 8.4. The binary search p — [(p_+p+)/2] is an attractive policy for choosing the candidate 
values p, but one may prefer to move toward the left end p_ of the range more rapidly, to decrease 
the size of the matrix B' . 

In principle in our search for numerical rank we can employ Corollary 17.11 or 17.21 instead of 
Theorem 17.21 Then we would have to apply the Subroutine COND to matrices of size m x n 
or larger, which means extra computational cost. Because of that the two respective Prototype 
Algorithms below cannot compete with Proto- Algorithm 18.11 unless the input matrix has a small 
numerical nullity. 

Proto- Algorithm 8.2. Numerical rank via randomized additive preprocessing. 

Input, Output and Stage 1 o/ Computations are the same as in Proto- Algorithm Wl\ 

Initialization; Compute the integer — n — p_ and a scalar a of order \\A\\, generate 
two matrices U+ £ Gq^^^* and V+ G G^^^^ ^ and write Us = U+{Is \ Os^m-s)^ and Vs = 
V+ih I On-s.sV for s^r_,r_ + l,...,r+. 

Computations; 

2. Compute the integer s = n — p. Compute the matrix C ^ A + UgV^ and apply to it the 
Subroutine COND. 

3. If this matrix is rank deficient or ill conditioned write p^ = p and go to Stage 1. Otherwise 
write p- = p and go to Stage 1. 

Proto- Algorithm 8.3. Numerical rank via randomized augmentation. 
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Input, Output and Stages 1 and 3 o/ Computations are the same as in Proto-Alaorithm [KR 



three matrices U+ G G^^^^ , V+ G G^^^^ : o.'^d W+ G Gq^J^^^ , and write i — 1, Aq — A, 

Us = U+{h I Os,^-sV', Vs = V+{Is I On-s.sV, and Ws = (/. | 0.,.+ -,)W^+(/. | Os,r^-sV 



Initialization; Compute the integer r_|_ = n — p_ and a scalar a of order \\A\\, generate 

for s — r_,r_ + 1, . . . ,r^. 
Computations; 

2. Compute the integer s = n — p. Compute the matrix K = [ ^ ^1 and apply to it the 
Subroutine COND. 

8.3 Preprocessing for Newton— Toeplitz iteration 

Newton's iteration for matrix inversion 

X,+i = X,{21 - CX,), i = 0, 1, . . . . (8.1) 
squares the residuals / — CXi^ that is, 

/ - CX^+i = {I- CX^f = {I- CXof*' , i ^ 0, 1, . . . . (8.2) 

Therefore 

< \\I-CX£ - \\I-CX^\f*\ 1 = 0,1,..., (8.3) 

and so the approximations Xi quadratically converge to the inverse ^ right from the start provided 
that ||/ — CXqII < 1. We can ensure that ||/ — CXqH < 1 — (k(c))"(i+k) choosing Xq = 
(i-i-„)j^cjj^l|ig|| — [PS91) . Newton's iteration can be incorporated into our randomized algorithms. 
E.g., we can use it instead of Gaussian elimination in Proto-Algorithm 19.41 of the next section. 

The map C X^ preserves the matrix structure of Toeplitz or Hankel type, but is the struc- 
ture maintained throughout the iteration? Not automatically. In fact Newton's loop can triple the 
displacement rank of a matrix Xk- The papers jP92) . jP93| . and |P93a) . however, have proposed to 
maintain the structure via recursive compression of the displacements (one can also say recompres- 
sion), thus defining Newton's structured (e.g., Newton-Toeplitz) iteration. Recall that we can readily 
recover a Toeplitz-like matrix from its displacement (cf. ()2.13p '). According to the compression pol- 
icy proposed in the papers [P92) . |P93) . and |P93a) . one should periodically set to the smallest 
singular values of the displacements of the matrices Xi to keep the length of the displacements 
within a fixed tolerance, equal to or a little exceeding the displacement rank of the input matrix 
C. According to the estimates in [POlj . the Newton-Toeplitz iteration converges quadratically right 
from the start provided ||/ — CXo|| < (^i+ \ \z \\ + \\ Zf \\ )K{C) \\^~^\\' 11-^""^ 1 1 — Ce,/n, L denotes the 
associated displacement operator L : C => Z^C — CZf ioi e f or L : C C — ZgCZj for 
e/ ^ 1, and Cg,/ is a constant defined by e and /. Similar bounds can be deduced for other classes 
of matrices with displacement structure [POl . Section 6.6]. See |PBRZ99j . |PRW02| . [POTl Chapter 
6] and |P10j on further information. The cost of computing the n x d generator matrices G and H 
with SVD or rank revealing factorization is not high for small ranks d, but randomized methods of 
Section [7] enable further cost decrease. 



8.4 Application to tensor decomposition 

Let 

A=[A{n,...,id)] (8.4) 

denote a d-dimensional tensor with entries A{ii, . . . ,id) and spacial indices ii, . . . ,id ranging from 
1 to ni, . . . , Ud, respectively. Define the d — 1 unfolding matrices A^ = [A{ii . . .ik', ik+i ■ ■ ■ id)], k = 
1, . . . , d, where the semicolon separates the multi-indices ii . . .ik and ik+i ■ ■ - id, which define the 
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rows and columns of the matrix Ak, respectively, k = l,...,d. The paper |O09j proposed the 
following class of Tensor Train Decompositions, hereafter referred to as TT Decompositions, where 
the summation indices ai, . . . , a^-i ranged from 1 to compression ranks ri, . . . , r^-i, respectively, 

T= ^ Gi{ii,ai)G2{ai,ii,a2) ■ ■ ■ Gd-i{ad-2,id-i,ad~i)Gd{ad,id)- (8.5) 

ai,...,ad-i 

Theorem 8.1. fOO^ . For any tensor A of there exists a TT decomposition 118. 5]) such that 
A = T and r^ — rank( Afc ) for k = \, . . . ,d — 1. 

There is a large and growing number of important applications of TT decompositions (jS.Sp to 
modern computations (cf. e.g., |OT09) . |OT10) . |0T11] ) where the numerical ranks of the unfolding 
matrices A^ are much smaller than their ranks, and it is desired to compress TT decompositions 
respectively. 

Theorem 8.2. \0T1(^ . For any tensor A of l{8.4\ ) and any set of positive integers r^ < rank(^fe), 
k = \, . . . ,d — 1, there exists a TT decomposition i8. 5\) such that 

d-l 

||A-T||2, < Vr2, Tfc = min \\Ak - B\\f, k = I, . . . ,d - I. (8.6) 

f — ■ rank(B)=rfc 

The constructive proof of this theorem in [OTlOj relies on inductive approximation of unfolding 
matrices by their SVDs truncated to the compression ranks r^. Let us sketch this construction. 
For c? = 2 we obtain a desired TT decomposition T{ii,i2) — X^ai Gi(*1j 0^1)^2 (cii, 12) (that is 
a sum of ri outer products of ri pairs of vectors) simply by truncating the SVD of the matrix 
A{ii, 12). At the inductive step one truncates the SVD of the first unfolding matrix Ai = S'^j^S^^Tj^ 
to obtain rank-ri approximation of this matrix Bi = Sbi^BxT'^_^ where = d\a.g{crj{Ai))Y^-^ 
and the matrices Sbi and Tb^ are formed by the first ri columns of the matrices Sax and Ta^ , 
respectively. Then it remains to approximate the tensor B = [B{ii, . . . ,id)\ represented by the 
matrix Bi. Rewrite it as '^'si (*i; Q^i)^('^i; *2 . . . id) for A — X^Si^Si' represent A as the 

tensor A = [A{aii2,i5, ■ ■ ■ ,id)\ of dimension d — 1, apply the inductive hypothesis to obtain a 
TT-approximation of this tensor, and extend it to a TT-approximation of the original tensor A. 

In [OTlOj the authors specify this construction as their Algorithm 1, prove error norm bound 
(|8.6p , then point out that the "computation of the truncated SVD for large scale and possibly dense 
unfolding matrices ... is unaffordable in many dimensions", propose "to replace SVD by some other 
dyadic decompositions A^ w UV"^ , which can be computed with low complexity", and finally specify 
such recipe as [OTIOI Algorithm 2], which is an iterative algorithm for skeleton or pseudoskeleton 
decomposition of matrices and which they use at Stages 5 and 6 of their Algorithm 1. The cost of 
each iteration of [OTlOl Algorithm 2] is quite low and empirically the iteration converges fast, but 
the authors welcome alternative recipes having formal support. 

Our randomized Proto- Algorithms 17.11 and 18 . II can serve as the alternatives to [OTIOI Algorithm 
2]. For the input matrix Ai above they use 0{ri) multiplications of this matrix by 0{ri) vectors, 
which means a low computational cost for sparse and structured inputs, whereas the expected output 
is an approximate matrix basis for the space §ri,Ai or T^^^Ai and a rank-ri approximation to the 
matrix Ai, within an expected error norm in 0((Tri+i(^i)). This is the same order as in [OT10[ 
Algorithm 1], but now we do not use SVDs. One can further decrease the error bound by means of 
small oversampling of Remark 17.21 and power transform of Remark 17.51 

9 2x2 block triangulation of an ill conditioned matrix, matrix 
inversion, and solving linear systems of equations 

In this section we apply the results of Section [7] to compute 2x2 block triangulation and the inverse 
of a nonsingular ill conditioned matrix. In Section 19.31 which can be read independently of the rest 
of the present section, we combine additive preconditioning and the SMW formula to solve a linear 
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system of equations with such matrix. One can alternatively combine dual additive preprocessing 
with the dual SMW formula or apply these techniques to precondition the input matrix. (See 
Remarks 19.41 and 19. 6|) . We partition some algorithms of this section into symbolic and numerical 
stages. At the former stage we perform computations with infinite precision, but they use a small 
fraction of the double precision flops involved. 

9.1 Block triangulation using approximate trailing singular spaces 

In Section [7.51 we have approximated leading and trailing singular spaces of ill conditioned matrices 
by applying randomized additive preprocessing, random sampling, or augmentation. Next we extend 
the former algorithms (based first on additive preprocessing of Theorem 17.51 and Corollary 17.11 and 
then on random sampling of Section 17.41) to 2 x 2 block triangulation of these matrices. One can 
similarly compute their block triangulation by extending either the algorithms of Section 17.51 based 
on randomized augmentation of Corollary 17.21 (we leave this to the reader) or the algorithms of 
Sections 17.41 and 17.61 for approximate matrix bases of the leading singular spaces of the input matrix 
(see the next subsection). 

Proto- Algorithm 9.1. Block triangulation with randomization and orthogonalization. 

Input: A matrix A G M"^" whose norm \\A\\ is neither large nor small, its numerical rank q 
satisfying 0<q^n — r<n, and a Subroutine LIN-SOLVE that either solves a linear system 
of equations if it is nonsingular and well condtioned or outputs FAILURE otherwise. 

Output: FAILURE (with a low probability) or four orthogonal matrices Kq and Lq in M"^'' and 
Ki and Li in M"^'" such that with a probability near 1 the qxq block submatrix Woo = ALq 



and strongly dominant such that o'g(Woo) ^ max{||IVoi||, llW^ioll; l|W^ii||}- 
Computations (see Remark WT]) : 

1. Generate two matrices U,V Cz Gq^*^. 

2. Compute the matrix C = A + UV'^ , expected to be nonsingular and well conditioned. 

3. Apply the Subroutine LIN-SOLVE to compute the matrices C~"^V and C^^U. Stop and 
output FAILURE if so does the .subroutine. 

4. Compute and output two orthogonal matrices Ki = Q{C~^U) and Li = Q(C~'^V). 

5. Compute and output two orthogonal nmbs Kq = nmb(/'iri) and Lq = nmb(ii). 

The algorithm can only fail with a low probability by virtue of Theorems l5.6l and l7.5l and Corollary 
17.11 We use the following theorem to prove correctness of the algorithm. 

Theorem 9.1. For a matrix A € qjk]^ Q < q < / = min{ri, m}, write r = n — q and f — m — q. 



Let Ka e R™><9, Lq e Ki e IR'"^^*^, Li € E"^'^, and Qk: Ql e W'' be six orthogonal matrices 

such that Ki = SA,fQK, -^1 — Ta.tQl, Kq — Of ^ and LJLq — Or.q- Then \\Kj A\\ < aq^i{A), 
\\ALi\\ < ay+i{A),'\\KoALo\\ = cri{A), and k{KqALq) = ai{A)/ay{A). 



SAdia.g{Om-r,n-r,{<^j{A^))^=q+iQL), and so ll^iill < (Tg+i(A) because Sa and Ql are orthogonal 
matrices. Similarly obtain that ||/'irj'"A|| < aq+i{A). 

Next deduce from the assumptions about Lq and Li that Lq = Tq^AQa for an orthogonal matrix 
Qq e W^'J and similarly that Kq — Sq,AQo for an orthogonal matrix Qq e R''^''. Therefore 




Proof Suppose A = Sa^aTJ is SVD of ([13]). Then ALi = SA^ATjTA,rQL = Sa^a 




KqALq = Q'^SlASA^ATlTq^AQa = (Q'q I 0„,,^)Ea(Qo | Or,nf = Q[, diag(a, ( A))]^, , 



and so ||A'oALo|| = cri(^), k{KqALq) = cri{A)/aq{A). 



□ 
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In Proto- Algorithm 19 . 1 1 we expect to have TZ{Li) w Ta.t by virtue of Theorem 1 7 . 5 1 and similarly 
to have TZ{Ki) w SA,r- Theorem 19.11 implies that the norms and ||ALi|| have an upper 

bound close to aq+i{A), whereas /{(iCjALo) ~ "^i (^) Z''"? (^) ■ Now correctness of the algorithm 
follows because the matrix A has numerical rank q. 

We can proceed with nonorthogonal matrices Kq, Ki, Lq, ii, Qk, and Qi to simplify the 
computations, by weakening numerical stability a little. Then we can still expect that the norms 
||W^oi||, ||W^io||: and ||Wii|| have at most order crg+i(A), the norm ||M^oo|| bas order ai{A), and the 
condition number k{Woo) has order ai{A)/aq{A). Moreover choosing random matrices Kq G Q^^"^ 
and Lq € ^/q 1*' which are expected to be well conditioned by virtue of Theorems 13.11 and 13.21 
combined, and we can still extend our probabilistic estimates for the values ||Wij|| for i,j = 1, 2 and 
k(Woo). Here is our respective simplified algorithm. Our tests in Section [10.41 show its efficiency. 

Proto- Algorithm 9.2. Simplified randomized block triangulation. 

Input, Output and Stages 1 and 2 of Computations are the same as in Proto- Algorithm \9.1\ 
except that the output matrices Kq, Lq, Ki and Li are no longer assumed to be orthogonal. 

Computations: 

3. Generate and output two random matrices Kq,Lq G Gq^"^ ■ 

4- Apply the Subroutine LIN- SOLVE to compute and to output the matrices Ki = C^^V and 
Li = C^^U . Output FAILURE and stop if so does the subroutine. 

Proto- Algorithms 19.11 and 19.21 do not produce block triangulation but prepare it. Having strong 
domination of the block W^oo: we can readily compute the block factorizations 



W = 



I 0\ /Woo Woi 
WwWqt} I [ O G 



for G = Wii- WiqWqq'^Wqi and 

XV -Wrr'WoiG-^\ (I O 



W 



-1 _ / ''•'oo -►^''00 
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Both Proto- Algorithms 19.11 and 19.21 reduce the inversion of the matrix A to the inversion of the 
matrices W^oo and G of smaller sizes, where both matrices are expected to be nonsingular and better 
conditioned than the matrix A (cf. |PGMQ[ Section 9]). 

Remark 9.1. We expect to arrive at the matrices W^i, Wiq and Wn having small norms. To 
counter the expected cancellation of the leading digits of the 2rn — r^ entries of these matrices, we 
■should compute the matrices G, Ki and Li, their products by the blocks of the matrix A, and the 
Schur complement G with a high precision (or partly symbolically, with infinite precision) . These 
computations involve 0{n^r) flops, that is a r/n fraction of order flops in high precision p+ 
required by Gaussian elimination. See further study in \PGMQ\ Section 9]. Having implemented 
this part of the computations with higher precision, we have outperformed the standard algorithms 
(see Section{TUj\and Tables [TOJl\ and llO.l^) . 



9.2 Block triangulation using approximate leading singular spaces 

Suppose a square matrix A has a small positive numerical rank q and define a dual variation of 
Proto- Algorithm 19. II bv applying Theorem 17.61 In this case matrix inversions are limited to the case 
oi qx q matrices H, KqKq and LqLq. In our dual algorithm we assume that the nonsingular input 
matrix A has been scaled so that the norm \ \A~^ \ \ is neither large nor small. See some recipes for the 
approximation of this norm at the end of Section [5] and see Remark 19.31 on how to proceed without 
estimating this norm. 
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Proto- Algorithm 9.3. Block triangulation using approximate leading singular spaces. 

Input: A nonsingular ill conditioned matrix A e R"^" scaled so that the norm is neither 

large nor small; the numerical rank q of the matrix A such that 0<q — n — r<n, and a 
Subroutine INVERT that either inverts a matrix if it is nonsingular and well conditioned or 
outputs FAILURE otherwise. 

Output: FAILURE (with a low probability) or four matrices Kq, Lq e R"^* and Ki, Li e R"^*" 
such that 

l':)-iK,\K,rAiL,\L,) 

and the block submatrix Woo = AL^) is expected to be nonsingular, well conditioned, and 
strongly dominant such that crg(Woo) ^ inax{||Woi||, l|W^io||i l|T^ii||}- 

Computations: 

1. Generate two matrices C/_ and V- in Qq^"^ . 

2. Compute the matrix H = Ig + V^AU- of ifO|) . 

3. Apply the Subroutine INVERT to compute the matrix H~^. Output FAILURE and stop 
if so does the subroutine. 

4. Compute the matrix C- = A- AU-H-^V^A of l2AU\) . 

5. Compute and output the matrices Kq — C_?7_/||C_[/_|| and Lo — C'EV-^ /\\C'EV-\\. 

6. Compute the matrices M w nm h{K^) and N w nmh{L^) (see our Section\T5\ \PQ10^ 
and \PQa\ on the approximation ofnmbs). 

7. Compute and output the matrices Ki = A//||M|| and Li — A^/||iV||. 

The algorithm fails with a low probability by virtue of Theorems 15.61 and 17.61 and Corollary 17.31 
Complete the correctness proof by extending Theorem 19.11 

Remark 9.2. As in the previous subsection, we must perform a small fraction of our computations 
with high accuracy. Namely we must compute the matrix H with high or infinite precision, but for 
that we need 0{qn'^) flops, versus order high precision flops in Gaussian elimination. Unlike 
the previous subsection, this stage involves only matrix multiplications, a matrix addition and no 
inversions, although we need matrix inversions for computing nmbs at Stage 6. 

Remark 9.3. Instead of applying Theorem \7.6] we can employ any other algorithm that computes a 
pair of approximate matrix bases Kq and Lq for the left and right leading singular spaces. E.g., we 
can apply randomized dual augmentation of Corollary \7.4\ or just compute Kq — A'^V and Lq = AU 
for V G ^0 i" '^"'^ ^ ^ ^0 i'' ('^f- Proto- Algorithm \7.1\ ). In a heuristic variation we can choose the 
matrices U,V'^ E 'Tqi'^ where A is a Toeplitz-like matrix and where the numerical rank q is not 
small. The latter computation requires no estimates for the norm \ \A~^\\ and in our tests has led 
to higher output accuracy than Proto- Algorithm \9.3\ For a further heuristic simplification in the 
case where n = 2q, choose the Toeplitz matrices U and V in the form F — {Z \ T), Z £ and 

ej e -Zq^", and then /"^) *^ ^ nmb{F). 

9.3 Randomized additive preconditioning with the SMW recovery 

Suppose that we seek the solution y = A^^b of a real nonsingular ill conditioned linear system 
Ay = b of n equations where we are given a small upper bound r on the numerical nullity of A. 
Assume that the norm | \A\ \ is neither large nor small. Then randomized additive preprocessing A =^ 
C = A + UV'^ for U,V e Gqi^ is expected to produce a well conditioned matrix C (cf. Theorem 
EH). The SMW formula implies that y = C^^b + C^^J/G" V^C^^b for G = Ir - V'^C-^U. 
Substitute Xu = C^^U and Xb = C~^h and obtain 
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y = Xb + XuG-W^Xb for G = - V^Xu- (9.1) 

This reduces the computation of y essentiaUy to the solution of the matrix equation CX = (U \ h) 
for X = {Xij I Xb), computing the matrix G, and its inversion. The solution algorithm below 
incorporates iterative refinement at this stage. 

Proto- Algorithm 9.4. Randomized Solution of a Linear System with Iterative Refine- 
ment. 

Input: a vector b g M"'^-'^, a nonsingular ill conditioned matrix A £ M"^", and its numerical 
nullity r — mm\{A) (cf. Remark \9.5\) . 

Output: A vector y « A^^h. 

Computations; 

1. Generate two matrices U,V £ Gq^^ ■ 

2. Compute the matrix C — A + UV'^ , expected to be nonsingular and well conditioned. 

3. Apply Gaussian elimination (or another direct algorithm) to compute an approximate inverse 
Y w . (Perform the computations in double precision. Application of the same algorithm 
to the original ill conditioned linear system Ay — b would require about as many flops but in 
extended precision.) 

4. Apply iterative refinement employing the approximate inverse Y to compute sufficiently ac- 
curate solution Xjj of the matrix equation CXu — U . (High accuracy is required to counter 
the cancelation of leading bits in the subsequent computation of the Schur complement G = 
Ir — V'^C~^U .) Then recover a close approximation to the vector y = A^^h by applying 
equation 119. 1]) . 

The algorithm reduces the original task of computations with ill conditioned matrix A to the 
computations with the well conditioned matrix C and 0{n^r) additional flops. To handle an ill con- 
ditioned input A, we must perform computations with extended precision to counter magnification 
of rounding errors, but we can confine this stage essentially to computing the Schur complement 
G = Ir — V'^G^^U. This is a small fraction of the computational time of the customary algorithms 
for a linear system Ay — h provided the ratio r/nis small and the precision required to handle the 
ill conditioned matrix A is high. 

Let us supply some estimates. To support iterative refienment we must use a precision p exceeding 
log2K(G); for well conditioned matrices G we can assume that p > 21og2K(G), say. Then order 
p — log2 n{G) new correct bits are produced per an output value by a loop of iterative refinement 
(see |PGMQ[ Section 9]), reduced essentially to multiplication of the matrices G and Y by 2r vectors, 
that is to (4n — 2)nr flops in a low (e.g., double) precision p. The refinement algorithm outputs 
order rn values; one can accumulate them with high accuracy as the sums of sufficiently many low 
precision summands (as in symbolic lifting |P11| ). Overall with this advanced implementation we 
only perform 0{rn^p+/p) flops in low precision p at Stage 4 of Proto- Algorithm 19.41 

For comparison Gaussian elimination uses -I- O(n^) flops in extended precision « pout + 
log2 n{A) to output the solution to the ill conditioned linear system Ay — b with a prescribed 
precision Pout- We compute an approximate inverse Y of the well conditioned matrix G at Stage 3 
by using |n'^ -|- 0{n^) flops as well, but in the low precision p. The cost of performing Stages 1 and 
2 is dominated, and so our progress is significant where np ^ rp^ and p+ greatly exceeds p. 

The estimated computational cost further decreases where the matrices A, U and V have consis- 
tent patterns of sparseness and structure. E.g., the decrease is by a factor n where they are Toeplitz 
or Toeplitz-like matrices. 

Remark 9.4. Given a nonsingular n x n matrix A (with « and a small upper bound q 

on its numerical rank, we can define a dual variation of Proto- Algorithm \9.4\ based on Corollary \7.3\ 
as follows: generate a pair of matrices U-,V- G Gq^'' and then compute the matrices H of \2. 9) 
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and C_ of iS.lOfI to reduce the solution of a linear system of equations Ay ^ h to computing the 
vector y = (CZ^ — U-V]^)h (cf. i2.9\) ). The matrix H — Iq + VZ AU- must he computed with high 
accuracy, hut this only requires 0(qn^) flops in high or infinite precision. Furthermore, unlike the 
computations hy means of the SMW formula, we only need matrix multiplications and an addition 
(and no inversions) at this stage. Similarly one can employ Corollary \ 7.4\ and expression ( [7.^[ ) 
instead of Corollary \7.3\ 

Remark 9.5. There is no point for applying Proto-Algorithm \9.4\ where the matrix A is well con- 
ditioned or has numerical nullity exceeding r. In the former case the preconditioning is not needed, 
whereas in the latter case additive preprocessing would produce an ill conditioned matrix C . In hoth 
cases preprocessing would give no benefits but would involve extraneous computations and additional 
rounding errors. In the case where r is equal to the numerical nullity of A, however, these deficiencies 
can be overwhelmed by the benefits of avoiding order high precision flops. 

Remark 9.6. Instead of using additive preconditioners directly for solving linear systems, one can 
combine them with the SMW or dual SMW formulae to obtain multiplicative preconditioners. Assume 
a nonsingular ill conditioned n x n matrix A and let A^ denote the inverse A~^ computed with 
double precision. Then the matrices A- A and AA^ are much better conditioned than the matrix 
A according to the experiments of \R90^ . Note that both linear systems of equations A- Ay = A-h 
and AA_x = b, for y — ^_x, are equivalent to the system Ay = b. This empirical technique 
is interesting itself and can probably be advanced by means of its recursive application. It may 
also accentuate the preconditioning power of our randomized preprocessing. Instead of defining the 
multiplier A- as the inverse A~^ computed with double precision, we can compute this multiplier as 
C^^{In — UG^^V^C"^) by applying the SMW formula. Moreover we can drop the factor C^^ and 
write either A^ = I^-UG-^V^C-^ or A_ = 1,,-C-^UG-^V'^ to have AA^ = A- AUG-^V'^ G-^ 
andA^A = A-UG-^V^G-^A or AA^ = A-AC-^UG-^V^ and A. A = A-G-^UG-^V'^ A. We 
can similarly utilize the dual SMW formula of \2. and \2.10\) . in which case we can compute the 
multiplier A_ = CZ^-U-V'^ for C_ of (27W^, and then we would have AA_ = {I-AUR-^V'^)-^- 
AU-Vl and A^A = {I - UR-^V'^A)-^ - U-ViA for H ^Iq + V^U^T of fOI) . 

10 Numerical Experiments 

Our numerical experiments with random general, Hankel, Toeplitz and circulant matrices have been 
performed in the Graduate Center of the City University of New York on a Dell server with a dual 
core 1.86 GHz Xeon processor and 2G memory running Windows Server 2003 R2. The test Fortran 
code was compiled with the GNU gfortran compiler within the Cygwin environment. Random 
numbers were generated with the random_number intrinsic Fortran function, assuming the uniform 
probability distribution over the range {x : —1 < x < 1}. The tests have been designed by the first 
author and performed by his coauthors. 

10.1 Conditioning tests 

We have computed the condition numbers of random general nxn matrices for n = 2*^, /c = 5, 6, . . . , 
with entries sampled in the range [—1,1) as well as complex general, Toeplitz, and circulant matri- 
ces whose entries had real and imaginary parts sampled at random in the same range [—1,1). We 
performed 100 tests for each class of inputs, each dimension n, and each nullity r. Tables fT0.2Hl 0.41 
display the test results. The last four columns of each table display the average (mean), minimum, 
maximum, and standard deviation of the computed condition numbers of the input matrices, re- 
spectively. Namely we computed the values k{A) = \ \A\\ \\A^^\\ for general, Toeplitz, and circulant 
matrices A and the values ki{A) — \\A\\i \\A~^\\i for Toeplitz matrices A. We computed and dis- 
played in Table [TO^ the 1-norms of Toeplitz matrices and their inverses rather than their 2-norms to 
facilitate the computations in the case of inputs of large sizes. Relationships (|2.2p link the 1-norms 
and 2-norms to one another, but the empirical data in Table [TUTT] consistently show even closer links, 
in all cases of general, Toeplitz, and circulant nxn matrices A where n = 32, 64, . . . , 1024. 
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10.2 Preconditioning tests 



Table [TU31 covers our tests for the preconditioning power of additive preprocessing in |PIMRld] . We 
have tested the input matrices of the following classes. 

In. Nonsymmetric matrices of type I with numerical nullity r. A ^ SJ^rT"^ are n x n matrices 
where S and T are nx n random orthogonal matrices, that is, the factors Q in the QR factorizations 
of random real matrices; = diag(crj)"^-^ is the diagonal matrix such that cTj+i < cTj for j = 
1, . . . , n — 1, (Ti = 1, the values a2, ■ ■ ■ , cr„_, _i are randomly sampled in the semi-open interval 
[0.1, 1), an-r = 0.1, (Tj = 10-16 for j = n - r + 1, . . . , n, and therefore k{A) = lO^^ [H02l Section 
28.3]. 

Is. Symmetric matrices of type I with numerical nullity r. The same as in part In, but for 

The matrices of the six other classes have been constructed in the form of jpjj + (31 where the 
recipes for defining the matrices A and scalars /3 are specified below. 

2n. Nonsymmetric matrices of type II with numerical nullity r. A = (W \ W Z) where W and Z 
are random orthogonal matrices of sizes n x [n — r) and in — r) x r, respectively. 

2s. Symmetric matrices of type II with numerical nullity r . A = WW"^ where W are random 
orthogonal matrices of size n x [n — r). 

3n. Nonsymmetric Toeplitz-like matrices with numerical nullity r. A = c{T \ TS) for random 
Toeplitz matrices T of size n x (n — r) and S of size [n — r) x r and for a positive scalar c such that 
\\A\\^1. 

3s. Symmetric Toeplitz-like matrices with numerical nullity r. A — cTT^ for random Toeplitz 
matrices T of size n x {n — r) and a positive scalar c such that ||^|| 1. 

4n. Nonsymmetric Toeplitz matrices with numerical nullity \. A — is a Toeplitz nx n 

matrix. Its entries random for z — j < n — 1. The entry a„_i is selected to ensure 

that the last row is linearly expressed through the other rows. 

4s. Symmetric Toeplitz matrices with numerical nullity 1. A — (ai.j)fj=i is a Toeplitz nx n 
matrix. Its entries random for |« — j | < n — 1, whereas the entry ai_„ = a„.i is a root 

of the quadratic equation detA — 0. We repeatedly generated the matrices A until we arrived at 
the quadratic equation having real roots. 

We set (3 = lO-"'^^ for symmetric matrices A in the classes 2s, 3s, and 4s, so that k{A) — 10^^ + 1 
in these cases. For nonsymmetric matrices A we defined the scalar /3 by an iterative process such 
that ||A|| « 1 and 10-i**P|| < k{A) < 10-^^\\A\\ [PIMRlOl Section 8.2]. Table [105] displays the 
average values of the condition numbers k{C) of the matrices C = A+UU'^ over 100,000 tests for the 
inputs in the above classes, r = 1, 2, 4, 8 and n = 100. We defined the additive preprocessor UU^ by 
a normalized nxr matrix U = U/\\U\\ where — (±/ | Or.r \ ±^ | Or.r I • ■ • I Or^r \ H \ Or.s), we 
chosen the integer s to obtain nxr matrices U and chosen the signs for the matrices ±/ at random. 
In our further tests the condition numbers of the matrices C = A + lO^UV^ for p = —10, —5, 5, 10 
were steadily growing within a factor lO'^' as the value \p\ was growing. This showed the importance 
of proper scaling of the additive preprocessor UV^ . 

10.3 GENP with random circulant multipliers 

Table 110.61 shows the results of our tests of the solution of a nonsingular well conditioned linear 
system Ay = b of n equations whose coefficient matrix had ill conditioned n/2 x n/2 leading 
principal block for n = 64, 256, 1024. We performed 100 numerical tests for each dimension n and 
computed the maximum, minimum and average relative residual norms ||^y — b||/||b|| as well as 
standard deviation. GENP applied to these systems has output corrupted solutions with residual 
norms ranging from 10 to 10^. When we preprocessed the systems with circulant multipliers filled 
with ±1 (choosing the n signs ± at random), the norms decreased to at worst 10"'' for all inputs. 
Table fT0?6l also shows further decrease of the norm in a single step of iterative refinement. Table 
2 in |PQZaj shows similar results of the tests where the input matrices were chosen similarly but 
so that all their leading blocks had numerical nullities and 1 and where Householder multipliers 
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/„ — uv"^/u"^v replaced the circulant raultipliers. Here u and v denote two vectors filled with integers 
1 and —1 under random choice of the signs + and — . 



10.4 Approximation of the tails and heads of SVDs and low-rank appro- 
ximation of a matrix 

At some specified stages of our tests of this subsection and Section [10.51 we performed additions, 
subtractions and multiplications with infinite precision (hereafter referred to as error-free ring op- 
erations). At the other stages we performed computations with double precision, and we rounded 
to double precision all random values. We performed two refinement iterations for the computed 
solution of every linear system of equations and matrix inverse. 

Table 110.71 shows the data from our tests on the approximation of trailing singular spaces of 
the SVD of an n x n matrix A having numerical nullity r — n — q and on the approximation of 
this matrix with a matrix of rank q — n — r. For n — 64, 128,256 and q = 1,8,32 we generated 
n X n random orthogonal matrices S and T and diagonal matrices S = diag(crj)"^]^ such that 

= j ^ l,...,q, aj ^ 10-1°, j ^ q + l,...,n (cf. [H02l Section 28.3]). Then we applied 
error-free ring operations to compute the input matrices A = Sa'S^aT'^, for which \\A\\ = 1 and 
k{A) = 10^". Furthermore we generated pairs of random nx r matrices U and V for r = 1,8,32, 
scaled them to ||?71^"^|| ~ 1, and computed the matrices C ^ A + UV^ (by applying error- free ring 
operations), Ba.t — C~^U (by using two refinement iterations), Ta^t, Ba^tYa.t as a least-squares 
approximation to TA,r, Q ~ Q{BA,r), and A— AQQ^ (by applying error-free ring operations). Table 
110.71 summarizes the data on the values k(C) and the residual norms rni ||i?yi,ryA.r — 2^-A,r || and 
rn2 = ||A — AQQ^II observed in 100 runs of our tests for every pair of n and q. 

We performed similar tests on the approximation of leading singular spaces of the SVDs of 
the same n x n matrices A having numerical rank q and numerical nullity r — n — q and on the 
approximation of this matrix with a matrix of rank q. In some tests we employed dual additive 
preprocessing to approximate matrix bases for the leading singular spaces Tq,A- We have generated 
the pairs oi n x q random matrices C/_ and V- for q — 1,8,32, scaled them to have ||?7_V^|| w 
= 10^°, and successively computed the matrices H = Ig -\- VTAU- of (|2.9p (by applying 
error-free ring operations), H^^ (by using two refinement iterations), C_ = A — AU^H^^VT A of 
(j2.10|) . Bq^A — CTF_, Tq,A, Bq^AYq^A as a least-squares approximation to T^,^, Qq,A — Q{Bq.A), 
and A — AQq,A{Qq.A)^ (by applying error-free ring operations). Table \T0.8\ summarizes the data 
on the condition numbers k(C_) and the residual norms rn^^-* = HBg^^Yq.^ — and rn^^-* = 

\\A — AQq,A{Qq.A)'^\\ obtained in 100 runs of our tests for every pair of n and q. 

We have also performed similar tests where we generated random nxq matrices U (for q — 1,8, 32) 
and random Toeplitz nxq matrices U (for q = 8, 32) and then replaced the above matrix Bq^A with 
the approximate matrix bases A'^U and A^U for the leading singular space Tq,A- Tables fTO. 91 and 
110.101 display the results of these tests. In both cases the residual norms are equally small and are 
about as small as in Tables 110.71 and 110.81 



10.5 2x2 block factorization and solving linear systems of equations 

For our next tests we have chosen n — 32, 64 and r — 1,2,4 and for every pair {n, r} generated 100 
instances of vectors b and matrices A, U, and V as follows. We generated (a) random vectors b of 
dimension n, (b) random real orthogonal nx n matrices S and T , and (c) random nx r matrices U 
and V, which we scaled to have \\U\\ = \\V\\ = 1. Then we defined the matrices E = diag(crj)"^j^, 
with <Jn-j — 10~i^ for J = 0, 1, . . . , r — 1, and (Jn-j = !/("■ — j) for j = r, . . . ,n — 1, and then 
applied error- free ring operations to compute the matrices A — S'ST^. Note that \\A\\ = 1 and 
||A-i|| = 10i7. 

For every such input we solved the linear systems Ay = b by applying Proto- Algorithm 19 . 1 1 We 
first generated random n x (n—r) matrices Kq and Lq and then computed the matrices C = A-\-UV'^ 
(by applying error-free ring operations), Ki = C^^V and Li = C^^U (by using two refinement 

\T AIT \ T \ - ( Wni 



iterations), and W — (Kq I ^i) MLo I -^i) = (by applying error- free ring operations). 

\Wio Wii' 
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In all our tests the matrices C were nonsingular and well conditioned, and the leading principal 
{n — r) X [n — r) blocks Wqq = ALq were well conditioned and strongly dominated the three 
other blocks Wqi , W^io, and Wn in the 2x2 block matrices W , as we expected to see by virtue of our 
analysis in Section [9. II Then we computed the vector b = (Xq | Kif^h (by applying error- free ring 
operations) and solved the linear system Wyi — b (by using two refinement iterations). Finally we 
computed the vector y — (Lq | ii)x (by applying error- free ring operations). Table [TI) .111 shows the 
average (mean) values of the relative residual norms \\Ay — b||/||b|| of the output vectors y (these 
values range about 10"^'^) as well as the minimums, maximums, and standard deviations in these 
tests. For the same ill conditioned inputs the Subroutine MLDIVIDE(A,B) for Gaussian elimination 
from MATLAB has produced corrupted outputs, as can be seen from Tablc [T0.12l 

We have also performed similar tests for n — 32, 64 and n x n matrices A and vectors b. We 
generated them as before, but for q — n ~ r — 1,2, A, and then we computed orthogonal matrices 
Kq, Ki, Lq and Li by employing dual additive preprocessing and Proto- Algorithm 19.31 We first 
generated and scaled the pairs of random nxq matrices U- and V- such that | |C/_ 1 1 \\V-\ \ « 3*10^, 
and so ||J7_|| ||VL|| « ||^~^|| = 10^^. Then we successively computed the matrices H (by applying 
error-free ring operations), H"^ (by using two refinement iterations), C_ = A — AU-H^^V^ A 
of (|2.10p . C-U- and CT.V- (all of them by applying error-free ring operations), Kf) — Q{C-U-), 
Lq = Q{C^V-), Ki — Q{nmb{Kj)) and Li — Q(nmb(LQ )), and continued as in the tests for Table 
110.111 We displayed the results in Table [TO. 131 showing the residual norms of the order 10^° on the 
average. 

Furthermore we have performed similar tests where we first generated random nxq matrices U 
and V and then computed the matrix products A'^V and AU (by applying error- free ring operations), 
and replaced the above matrices Kq and Lq by Kq = Q{A'^V) and Lq = Q{AU). Table ITU. 1 41 displavs 
the results of these tests, showing the residual norms of order 10"'^^ on the average. Then again 
for the same ill conditioned inputs the Subroutine MLDIVIDE(A,B) for Gaussian elimination from 
MATLAB produced corrupted outputs, similarly to the results in Table [TP. 121 

We applied Proto- Algorithm 19.41 to solve linear systems of equations with the same inputs as 
above for small integers r. In these computations we used two refinement iterations for computing 
an approximate inverse Y « and the solutions Xjj and Xb to the r+1 linear systems of equations 
CXjj — U and Cxb = b, all with the matrix C. We computed the following matrices and vectors 
by applying error-free ring operations, C = A + UV'^ , G = Ir — V'^Xu, and y = Xb + Xi/YV^Xh 
of (19. ip . Table 110.151 shows the test results for the same inputs as we used for tests of Table 110.11) 
except that now we have doubled the matrix size to n = 64 and n = 128. 

10.6 Solution of a real symmetric Toeplitz linear system of equations 
with randomized augmentation 

We have solved 100 real symmetric linear systems of equations Ty = b for each n where we used 
vectors b with random coordinates from the range [—1, 1) and Toeplitz matrices T = S + 10~^/„ 
for a singular symmetric Toeplitz n x n matrices S having rank ti — 1 and nullity 1 and generated 
according to the recipe in |PQ10[ Section 10.1b]. Table fTO.161 shows the average CPU time of the 
solutions by our Algorithm 6.1 and, for comparison, based on the QR factorization and SVD, which 
we computed by applying the LAPACK procedures DGEQRF and DGESVD, respectively. To solve 
the auxiliary Toeplitz linear system Xx = ei in Algorithm 6.1, we first employed the Toeplitz linear 
solver of [KV99^ . [Y99,, [VBHKOlj . and |VK98) and then applied iterative refinement with double 
precision. 

The abbreviations "Alg. 6.1", "QR", and "SVD" indicate the respective algorithms. The last 
two columns of the table display the ratios of these data in the first and the two other columns. 
We measured the CPU time with the mclock function by counting cycles. One can convert them 
into seconds by dividing their number by a constant CLOCKS_PER_SEC, which is 1000 on our 
platform. We marked the table entries by a "-" where the tests were running too long and have not 
been completed. We obtained the solutions y with the relative residual norms of about 10"^^ in all 
three algorithms, which showed that Algorithm 6.1 employing iterative refinement was as reliable 
as the QR and SVD based solutions but ran much faster. We refer the reader to |PQZC[ Table 3] 
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on similar test results for the solution of ill conditioned homogeneous Toeplitz linear systems. 



Table 10.1: The norms of random general, Toeplitz and circulant matrices and of their inverses 



matrix A 


n 


Pill 


M\2 




A 


1 


P^'lll 


P"'ll2 




A-' 


1 




A 


2 










32 


1.9 X 10^ 


1.8 X 10^ 


1.0 X 10" 


4.0 X 10^ 


2.1 X 10^ 


1.9 X 10" 


General 


64 


3.7 X 10^ 


3.7 X 10^ 


1.0 X 10" 


1.2 X 10^ 


6.2 X IQi 


2.0 X 10" 


General 


128 


7.2 X 10^ 


7.4 X 10^ 


9.8 X 10"^ 


3.7 X 10^ 


1.8 X 10^ 


2.1 X 10" 


General 


256 


1.4 X 10^ 


1.5 X 10^ 


9.5 X 10^1 


5.4 X 10^ 


2.5 X 10^ 


2.2 X 10" 


General 


512 


2.8 X 10^ 


3.0 X 10^ 


9.3 X 10"^ 


1.0 X 10^ 


4.1 X 10^ 


2.5 X 10" 


General 


1024 


5.4 X 10^ 


5.9 X 10^ 


9.2 X 10"^ 


1.1 X 10^ 


4.0 X 10^ 


2.7 X 10" 


Toeplitz 


32 


1.8 X 10^ 


1.9 X 10^ 


9.5 X 10"^ 


2.2 X lOi 


1.3 X IQi 


1.7 X 10" 


Toeplitz 


64 


3.4 X 10^ 


3.7 X 10^ 


9.3 X 10"^ 


4.6 X lOi 


2.4 X IQi 


2.0 X 10" 


Toeplitz 


128 


6.8 X 10^ 


7.4 X 10^ 


9.1 X 10^1 


1.0 X 10^ 


4.6 X IQi 


2.2 X 10" 


Toeplitz 


256 


1.3 X 10^ 


1.5 X 10^ 


9.0 X 10-1 


5.7 X 10^ 


2.5 X 10^ 


2.3 X 10" 


Toeplitz 


512 


2.6 X 10^ 


3.0 X 10^ 


8.9 X 10-1 


6.9 X 10^ 


2.6 X 10^ 


2.6 X 10" 


Toeplitz 


1024 


5.2 X 10^ 


5.9 X 10^ 


8.8 X 10-1 


3.4 X 10^ 


1.4 X 10^ 


2.4 X 10" 


Circulant 


32 


1.6 X 10^ 


1.8 X 10^ 


8.7 X 10-1 


9.3 X 10" 


1.0 X IQi 


9.2 X 10-1 


Circulant 


64 


3.2 X 10^ 


3.7 X 10^ 


8.7 X 10-1 


5.8 X 10" 


6.8 X 10" 


8.6 X 10-1 


Circulant 


128 


6.4 X 10^ 


7.4 X 10^ 


8.6 X 10-1 


4.9 X 10" 


5.7 X 10" 


8.5 X 10-1 


Circulant 


256 


1.3 X 10^ 


1.5 X 10^ 


8.7 X 10-1 


4.7 X 10" 


5.6 X 10" 


8.4 X 10-1 


Circulant 


512 


2.6 X 10^ 


3.0 X 10^ 


8.7 X 10-1 


4.5 X 10" 


5.4 X 10" 


8.3 X 10-1 


Circulant 


1024 


5.1 X 10^ 


5.9 X 10^ 


8.7 X 10-1 


5.5 X 10" 


6.6 X 10" 


8.3 X 10-1 



11 Related work, our progress, and further study 



Preconditioned iterative algorithms for linear systems of equations is a classical subject |A94] , |B02) , 
[G97| . The problem of creating inexpensive preconditioners for general use has been around for a 
long while as well. On estimating the condition numbers of random matrices see jD88| , |E88j , |ES05) , 
[CD05] , |SST06j , |B11) , and the bibliography therein. The study in the case of random structured 
matrices was stated as a challenge in jSST06| . We provide such estimates for Gaussian random 
Toeplitz and circulant matrices in Sections 13.41 and 13.51 They can be surprising because the paper 
[BG05j has proved that the condition numbers grow exponentially in n as n — ^ oo for some large 
and important classes of n x 71 Toeplitz matrices, whereas we prove the opposit for Gaussian random 
Toeplitz n X n matrices. 

Our present study of randomized preconditioning extends |PGMQ| , |PIMRld] . |PQZa| , and 
|PQZC| . In Sections iTl llOl we outline and test some new applications of randomized precondition- 
ing, whereas our Theorems 14.11 and 15.61 and and Corollaries 16.11 and 16.21 support such applications 
formally. The formal support relies on using Gaussian random matrices, but empirically our al- 
gorithms remained as efficient where instead we employed Gaussian random Toeplitz matrices and 
quite typically where we further decreased the number of random parameters involved, e.g., where 
we used block vectors U with the blocks ±7 and O in the map A =^ C = A + UU'^ (see the end of 
Section [10.21 and Table [T031) and used circulant or Householder multipliers filled with ±1 (see our 
Section fl 0.31 and Table and |PQZa[ Table 2]), thus limiting randomization to the choice of the 
block sizes in the block vector U and of the signs ± for the blocks ±7 and entries ±1. 

Besides the cited estimates for the condition numbers of Gaussian random Toeplitz and circulant 
matrices, our technical novelties include randomized multiplicative and additive preconditioning, 
dual additive preprocessing, augmentation and dual augmentation techniques, extensions of the 
SMW formula, proof of the power of all these techniques, randomized algorithms for numerical rank 
and approximation of leading and trailing singular spaces avoiding orthogonalization and pivoting 
and their applications to low-rank approximation of a matrix, tensor decomposition, and 2x2 block 
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Table 10.2: The condition numbers k{A) of random matrices A 



fl 


input 


min 


max 


mean 


std 


32 


real 


2.4 X 10^ 


1.8 X 10^ 


2.4 X 10^ 


3.3 X 10^ 


32 


complex 


2.7 X 10^ 


8.7 X 10^ 


1.1 X 10^ 


1.1 X 10^ 


64 


real 


4.6 X 10^ 


1.1 X 10* 


5.0 X 10^ 


1.1 X 10^ 


64 


complex 


5.2 X 10^ 


4.2 X 10^ 


2.7 X 10^ 


4.6 X 10^ 


128 


real 


1.0 X 10^ 


2.7 X 10* 


1.1 X 10-^ 


3.0 X 10^ 


128 


complex 


1.3 X 10^ 


2.5 X 10^ 


3.9 X 10^ 


3.3 X 10^ 


256 


real 


2.4 X 10^ 


8.4 X 10* 


3.7 X 10^ 


9.7 X 10^ 


256 


complex 


2.5 X 10^ 


1.4 X 10* 


1.0 X 10^ 


1.5 X 10^ 


512 


real 


3.9 X 10^ 


7.4 X 10^ 


1.8 X 10* 


8.5 X 10* 


512 


complex 


5.7 X 10^ 


3.2 X 10* 


2.3 X 10^ 


3.5 X 10^ 


1024 


real 


8.8 X 10^ 


2.3 X 10^ 


8.8 X 10^ 


2.4 X 10* 


1024 


complex 


7.2 X 10^ 


1.3 X 10^ 


5.4 X 10^ 


1.4 X 10* 


2048 


real 


2.1 X 10^ 


2.0 X 10^ 


1.8 X 10* 


3.2 X 10* 


2048 


complex 


2.3 X 10^ 


5.7 X 10* 


6.7 X 10^ 


7.2 X 10^ 



Table 10.3: The condition numbers ki{A) = of random Toeplitz matrices A 



n 


min 


mean 


max 


std 


256 


9.1 X 10^ 


9.2 X 10^ 


1.3 X lO'' 


1.8 X 10* 


512 


2.3 X 10^ 


3.0 X 10* 


2.4 X 10^ 


4.9 X 10* 


1024 


5.6 X 10^ 


7.0 X 10* 


1.8 X 10" 


2.0 X lO'' 


2048 


1.7 X 10* 


1.8 X 10^ 


4.2 X 10« 


5.4 X 10^ 


4096 


4.3 X 10* 


2.7 X 10^ 


1.9 X 10« 


3.4 X 10^ 


8192 


8.8 X 10* 


1.2 X 10*^ 


1.3 X 10^ 


2.2 X 10" 



factorizations of ill conditioned matrices. 

Approximation by low-rank matrices and the extensions to tensor decompositions are thriving 
research areas, with numerous applications to matrix and tensor computations. One can partly 
trace it s previou s study through the papers I GTZ97| . |GT01| . |GOS08) . |HMTllj . |KB09) . |MMD08) . 
[OT09| , |OT10] , |0T11) , |T00| , and the bibliography therein, but much earlier advances in this area 
appeared in the papers jBCLR79] , |B80| , jB85j , |B86j , jBC87| , directed to estimating the border rank 
of matrices and tensors and initially motivated by the design of fast matrix multiplication algorithms. 
Presently, application of tensor decompositions to the acceleration of matrix computations is a 
fashionable subject with applications to many important areas of modern computing (see, e.g., 
[TOO) . |MMD08] ■ |OT09| . |KB09) V but then again its earliest examples appeared in the cited papers 
on border rank and in jP72] . The latter paper has introduced the technique of trilinear aggregation, 
a basic ingredient of all subsequent fast algorithms for matrix multiplication with the inputs of both 
immense sizes (far beyo nd an y prac ti cal inte re st) [P 79], [P8l] . [P§4) . [CW90| . [STO] . |VW12I and 
realistic moderate sizes |P81| . |P84] . |LPS92) . |K04) . Historically this work was the first example 
of the acceleration of fundamental matrix computations by means of tensor decomposition. More 
special direction of Tensor Train (TT) decomposition was proposed in [O09| and further developed 
in |OT09| . [OTlOj . [OTT] . It is closely related to the DMRG qua ntum computations jV03] . [W93] 
and Hierarchical Tucker (HT) tensor decomposition |HK09j . |H12j . 

We hope that our present paper will motivate further study of randomized augmentation and 
dual augmentation, their link to aggregation processes (cf. |MP80| . |PQa| ), specification of our 
methods to ill conditioned matrices that have displacement or rank structures (cf . |KKM79) , |P90) , 
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Table 10.4: The condition numbers k{A) of random circulant matrices A 



TL 


min 


mean 


max 


std 


256 


9.6 X 10" 


1.1 X 10^ 


3.5 X 10^ 


4.0 X 10^ 


512 


1.4 X 10^ 


8.5 X 10^ 


1.1 X 10^ 


1.3 X 10^ 


1024 


1.9 X 10^ 


1.0 X 10^ 


5.9 X 10^ 


8.6 X 10^ 


2048 


4.2 X 10^ 


1.4 X 10^ 


5.7 X 10^ 


1.0 X 10^ 


4096 


6.0 X 10^ 


2.6 X 10^ 


3.5 X 10^ 


4.2 X 10^ 


8192 


9.5 X 10^ 


3.0 X 10^ 


1.5 X 10^ 


2.5 X 10^ 


16384 


1.2 X 10^ 


4.2 X 10^ 


3.6 X 10^ 


4.5 X 10^ 


32768 


2.3 X 10^ 


7.5 X 10^ 


5.6 X 10^ 


7.1 X 10^ 


65536 


2.4 X 10^ 


1.0 X 10^ 


1.2 X lO'' 


1.3 X 10^ 


131072 


3.9 X 10^ 


1.4 X 10^ 


5.5 X 10^ 


9.0 X 10^ 


262144 


6.3 X 10^ 


3.7 X 10^ 


1.1 X 10'^ 


1.1 X 10* 


524288 


8.0 X 10^ 


3.2 X 10^ 


3.1 X 10"* 


3.7 X 10^ 


1048576 


1.2 X 10^ 


4.8 X 10^ 


3.1 X 10* 


5.1 X 10^ 



IGK095) . [POT] . IVBHKOlj . |EG99j . |VVM07) . |VVM08) on the displacement and rank structured 
matrices), and comparison and combination of our techniques with each other and various known 
methods, such as the techniques of [R90 (cf. our Remark l9.6p . the homotopy continuation methods 
and Newton's structured iteration (cf. Section [8731 jPKRK) . |P01| Chapter 6], and |P10j ). Formal 
and experimental support of weakly randomized preconditioning (using fewer random parameters) 
remains an important research challenge. Natural extensions of our study should also include lower 
estimates for the smallest singular value of a random m x n matrix where m 3> ?i or rt ^ to, 
further links between augmentation and additive preprocessing, and the refinement, development 
and testing of our algorithms. 
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Table 10.5: Preconditioning tests 



Type 


r 


Cond j 


in 


i 


O 01 1/ 1 o 

o.zirj+z 


in 


z 


A C;OT7 1 Q 

4.ozii;H-o 


in 


4 


o nm7 1 IX 


in 


8 


(} ACiY? 1 O 

D.4Uii/+z 


is 


1 

i 


O.o0ii/+zx 


is 


Z 


i.UOii/H-4 


is 


4 


i. ( zrj+o 


is 


Q 


o.oUrj+o 


zn 


1 

i 


o.Uorj+i 


zn 


o 
Z 


D.ozrj+o 


zn 


A 

4 


O TCTT^ 1 A 


zn 


Q 

8 


Q cntr' 1 Q 
o.oyrj+o 


On 

zS 


1 

i 


i.iyrj+o 


Or- 
ZS 


Z 


1 0(^17' 1 Q 

i.ybrj+o 


On 

ZS 


A 

4 


1 0017 1 A 

i.Uyii/+4 


On 

ZS 


Q 

8 


y. ( irj+o 


on 


1 

i 


O 0017 1 A 

z.Uzii/+4 


on 


o 
Z 


i.ooii/+o 


on 


A 

4 


D.UOrj+Z 


on 


o 
8 




Qn 
OS 


i 


O Qm7 1 A 

z.oyii/H-4 


OS 


4 


i.oyrj H-o 


3s 


8 


6.74E+3 


4n 


1 


4.93E+2 


4n 


2 


4.48E+2 


4n 


4 


2.65E+2 


4n 


8 


1.64E+2 


4s 


1 


1.45E+3 


4s 


2 


5.11E+2 


4s 


4 


7.21E+2 


4s 


8 


2.99E+2 



Table 10.6: Relative residual norms: randomized circulant GENP for well conditioned linear systems 
with ill conditioned leading blocks (cf. |PQZa[ Table 2]) 



dimension 


iterations 


min 


max 


mean 


std 


64 





4.7 X 10"" 


8.0 X IQ-" 


4.0 X IQ-i^ 


1.1 X 10-" 


64 


1 


1.9 X 10^1^ 


5.3 X IQ-i^ 


2.3 X 10-" 


5.4 X 10-" 


256 





1.7 X 10-1^ 


1.4 X 10-^ 


2.0 X 10-y 


1.5 X 10-** 


256 


1 


8.3 X 10-1^ 


4.3 X 10-1" 


4.5 X 10-1^ 


4.3 X 10-" 


1024 





1.7 X 10-1" 


4.4 X 10-** 


1.4 X 10--^ 


2.1 X 10-** 


1024 


1 


3.4 X 10-" 


9.9 X IQ-i-i 


6.8 X 10-" 


2.7 X 10-" 
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Table 10.7: Tails of the SVDs and lower-rank approximations (cf. |PQ10| ) 



r 


k(C) or rrni 


n 


min 


max 


mean 


std 


1 


<C) 


64 


2.38 X 10+"^ 


1.10 X 10+"=^ 


6.25 X 10+"^ 


1.68 X 10+"* 


1 


n{C) 


128 


8.61 X 10+"^ 


7.48 X 10+"^ 


1.32 X 10+"^ 


7.98 X 10+"" 


1 


<C) 


256 


9.70 X 10+"^ 


3.21 X 10+"^ 


3.58 X 10+"" 


3.21 X 10+"" 


1 


rni 


64 


4.01 X 10""' 


1.50 X lO-"'-" 


5.30 X 10-"" 


1.59 X 10-"** 


1 


rni 


128 


7.71 X 10-"* 


5.73 X lO-"'-" 


1.58 X 10-"** 


6.18 X 10-"** 


1 


rni 


256 


7.57 X 10-^" 


3.2 X 10-"^ 


1.69 X 10-"** 


5.02 X 10-"** 


1 


rn2 


64 


4.01 X 10-1" 


1.50 X 10-"^ 


5.30 X 10-"" 


1.59 X 10-"** 


1 


rn2 


128 


7.71 X 10-1" 


5.73 X 10-"' 


1.58 X 10-"« 


6.18 X 10-"** 


1 


rn2 


256 


7.57 X 10-1" 


3.22 X 10-"^ 


1.69 X 10-"** 


5.02 X 10-"** 


8 


<C) 


64 


1.26 X 10+'« 


1.61 X 10+"^ 


2.68 X 10+"" 


1.71 X 10+"" 


8 


<C) 


128 


2.92 X 10+"^ 


3.42 X 10+"*^ 


1.58 X 10+"" 


4.12 X 10+"" 


8 


«(C) 


256 


1.39 X 10+"* 


8.75 X 10+"'' 


1.12 X 10+"" 


8.74 X 10+"" 


8 


rni 


64 


3.39 X 10-1" 


2.27 X 10-"" 


2.74 X 10-"** 


2.27 X 10-"^ 


8 


rni 


128 


4.53 X 10-1" 


1.91 X 10-"^ 


1.03 X 10-"** 


2.79 X 10-"** 


8 


rni 


256 


8.74 X 10-1" 


1.73 X 10-"^ 


7.86 X 10-"" 


1.90 X 10-"** 


8 


rn2 


64 


1.54 X 10-"^ 


7.59 X 10-"" 


8.87 X 10-"** 


7.58 X 10-"' 


8 


rn2 


128 


1.82 X 10-"*^ 


7.27 X 10-"^ 


2.95 X 10-"** 


8.57 X 10-"** 


8 


rn2 


256 


2.62 X lO-""* 


3.89 X 10-"'^ 


2.27 X 10-"** 


5.01 X 10""** 


32 


<C) 


64 


1.77 X 10+"^ 


9.68 X 10+"" 


1.58 X 10+"" 


9.70 X 10+"" 


32 


<C) 


128 


1.65 X 10+"'' 


6.12 X 10+"'' 


1.02 X 10+"" 


6.19 X 10+"" 


32 


«(C) 


256 


3.57 X 10+"* 


2.98 X 10+"« 


4.12 X 10+"" 


2.98 X 10+"^ 


32 


rni 


64 


2.73 X 10-1" 


3.29 X 10-"« 


2.95 X 10-"" 


4.93 X 10-"" 


32 


rni 


128 


3.94 X 10-1" 


1.29 X 10-"'' 


7.18 X 10-"" 


1.64 X 10-"** 


32 


rni 


256 


6.80 X 10-1" 


4.00 X 10-"'' 


1.16 X 10-"** 


4.27 X 10-"** 


32 


rn2 


64 


2.10 X lO-"'' 


1.49 X 10-"'' 


1.55 X 10-"** 


2.18 X 10-"** 


32 


rn2 


128 


2.79 X 10-"" 


3.80 X 10-"'' 


3.81 X 10-"** 


6.57 X 10-"** 


32 


rn2 


256 


5.35 X lO-"** 


1.05 X 10-"" 


5.70 X 10-"** 


1.35 X 10-"'' 
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Table 10.8: Heads of SVDs and low-rank approximations via dual additive preprocessing 



q 


k(C-) or rrni 


n 


min 


maix 


mean 


std 


1 




64 


1.83 X 10+"2 


1.26 X 10+"** 


1.74 X 10+"* 


1.27 X 10+"^ 


1 




128 


6.75 X 10+"2 


8.76 X 10+"-^ 


2.35 X 10+"* 


9.10 X 10+"* 


1 




256 


4.19 X 10+"-^ 


5.82 X 10+"^ 


4.43 X 10+"* 


8.98 X 10+"* 


1 




64 


2.43 X 10^1" 


3.86 X 10-°*^ 


2.55 X 10""^ 


5.43 X 10""^ 


1 




128 


4.36 X 10~i" 


1.15 X 10""^ 


4.45 X 10-"9 


1.24 X 10""** 


1 




256 


6.40 X 10"^" 


3.17 X 10"°*^ 


4.00 X 10~"^ 


5.16 X 10""'' 


1 


rn(2) 


64 


8.30 X 10-1° 


3.86 X 10-°8 


2.81 X 10-"9 


5.35 X 10-"9 


1 




128 


1.21 X 10"^ 


1.15 X 10-"^ 


4.80 X 10-"^ 


1.23 X 10""** 


1 




256 


1.72 X 10^^ 


3.18 X 10-"^ 


4.53 X 10-"9 


4.97 X 10-"9 


8 




64 


1.37 X 10+^^^ 


1.87 X 10+"'^ 


7.57 X 10+"* 


2.16 X 10+"=^ 


8 




128 


3.80 X 10+"=* 


8.64 X 10+"** 


2.00 X 10+"^ 


8.73 X 10+"^ 


8 


«(C-) 


256 


2.57 X 10+"'' 


1.54 X 10+"'^ 


7.25 X 10+"'^ 


2.03 X 10+"'' 


8 




64 


1.87 X 10-^ 


4.48 X 10-°^ 


2.29 X 10-"s 


5.20 X 10-"** 


8 


rn(i) 


128 


3.04 X 10^°^ 


3.73 X 10-°^ 


2.72 X 10-"8 


5.83 X 10""^ 


8 


rn(i) 


256 


3.78 X 10-09 


2.01 X 10"°^ 


4.81 X 10""^ 


2.02 X 10""^ 


8 


rn(2) 


64 


1.30 X 10-"^ 


2.47 X 10-"^ 


1.09 X 10""* 


2.70 X 10""** 


8 




128 


1.85 X 10-°^ 


1.50 X 10"°^ 


1.36 X 10-"s 


2.75 X 10-"s 


8 




256 


2.19 X 10-°9 


1.10 X 10"°** 


2.36 X 10""^ 


1.10 X 10""^ 


32 


«(C-) 


64 


3.75 X 10+"=* 


3.25 X 10+"^ 


6.01 X 10+"'^ 


3.28 X 10+"« 


32 


«(C-) 


128 


2.41 X 10+"4 


1.09 X 10+"« 


1.95 X 10+"** 


1.10 X 10+"^ 


32 


«(C_) 


256 


1.33 X 10+"^ 


2.11 X 10+1" 


2.18 X 10+"** 


2.11 X 10+"" 


32 


rn(i) 


64 


7.78 X 10-°*^ 


1.39 X 10-"s 


8.17 X 10^"8 


1.94 X 10""^ 


32 




128 


9.81 X 10""^ 


2.35 X 10^"^ 


1.17 X 10""^ 


3.05 X 10""^ 


32 


rn(i) 


256 


2.05 X 10"°* 


3.99 X 10-"6 


1.91 X 10""^ 


5.06 X 10""^ 


32 


rn(2) 


64 


1.84 X 10""^ 


2.62 X 10""^ 


1.85 X 10""** 


4.09 X 10""** 


32 


rn'2) 


128 


2.47 X 10-°*^ 


6.77 X lO""'' 


2.93 X 10""** 


8.38 X 10""** 


32 


rn(2) 


256 


5.05 X 10-°^ 


8.85 X 10"°^ 


4.38 X 10-"8 


1.14 X 10""^ 
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Table 10.9: Heads of SVDs and low-rank approximation with random multipliers 



q 


rrni 


n 


min 


max 


mean 


std 


1 




64 


2.35 X 10-1" 


1.32 X 10-°^ 


3.58 X 10-"'' 


1.37 X 10-"® 


1 




12<S 


4.41 X 10-"' 


3.28 X 10-"'^ 


3.55 X 1()-"'' 


5.71 X 1()-"'' 


1 




256 


6.98 X 10-"' 


5.57 X 10-°" 


5.47 X 10-"-' 


8.63 X 10-"-' 


1 


i-n(2) 


64 


8.28 X 10-1° 


1.32 X 10-°^ 


3.86 X 10-"9 


1.36 X 10-"® 


1 


i-n(2) 


128 


1.21 X 10-09 


3.28 X 10-°^ 


3.91 X 10-°9 


5.57 X 10-"9 


1 




256 


1.74 X 10-™ 


5.58 X 10-°® 


5.96 X 10-"^ 


8.47 X 10-"^ 


8 




128 


2.56 X 10-°^ 


1.16 X 10-°^ 


4.30 X 10-"® 


1.45 X 10-"^ 


8 




256 


4.45 X 10-°^ 


3.32 X 10-°^ 


3.40 X 10-"® 


5.11 X 10-"® 


8 


i-n(2) 


64 


1.46 X 10-°9 


9.56 X 10-°8 


5.77 X 10-°9 


1.06 X 10-"® 


8 


i-n(2) 


128 


1.64 X 10-°^ 


4.32 X 10-°^ 


1.86 X 10-"® 


5.97 X 10-"® 


8 


i-n(2) 


256 


2.50 X 10-°9 


1.56 X 10-°^ 


1.59 X 10-"® 


2.47 X 10-"® 


32 


i-j^(i) 


64 


6.80 X 10-"9 


2.83 X IQ-"'* 


1.01 X 10-"^ 


3.73 X 10-"^ 


32 




128 


1.25 X 10-°^ 


6.77 X 10-°^ 


1.28 X 10-"^ 


6.76 X 10-"^ 


32 




256 


1.85 X 10-°** 


1.12 X 10-°^ 


1.02 X 10-"^ 


1.54 X 10-"^ 


32 


j-n(2) 


64 


1.84 X 10-°^ 


6.50 X 10-°^ 


2.30 X 10-"® 


8.28 X 10-"® 


32 


i-n<.2) 


128 


3.11 X 10-°^ 


1.45 X 10-°*^ 


2.87 X 10-"® 


1.45 X 10-"^ 


32 




256 


4.39 X 10-°^ 


2.16 X 10-°^ 


2.37 X 10-"® 


3.34 X 10-"® 



Table 10.10: Heads of SVDs and low-rank approximations with random Toeplitz multipliers 



q 




n 


min 


max 


mean 


std 


8 




64 


2.22 X 10-"^ 


7.89 X 10-"'' 


1.43 X 10-°^ 


9.17 X 10-"^ 


8 


rrn^i^ 


128 


3.79 X 10-"^ 


4.39 X 10-"-^ 


4.87 X 10-"^ 


4.39 X 10-"s 


8 


j-j-n(i) 


256 


5.33 X 10-"^ 


3.06 X 10-"6 


6.65 X 10-"® 


3.12 X 10-"^ 


8 


(2) 

rrn'-^ 


64 


1.13 X 10-"^ 


3.66 X 10-"^ 


6.37 X 10-"® 


4.11 X 10-"^ 


8 


rrn'- 


128 


1.81 X 10-"9 


1.67 X 10-"5 


1.90 X 10-"^ 


1.67 X 10-"6 


8 


l2) 


256 


2.96 X 10-"9 


1.25 X 10-"*^ 


2.92 X 10-"® 


1.28 X 10-"^ 


32 




64 


6.22 X 10-"^ 


5.00 X 10-"^ 


4.06 X 10-"® 


6.04 X 10-"® 


32 


j-j-n(i) 


128 


2.73 X 10-"® 


4.88 X 10-"*^ 


2.57 X 10-"^ 


8.16 X 10-"^ 


32 




256 


1.78 X 10-"® 


1.25 X 10-"^ 


1.18 X 10-"^ 


2.03 X 10-"^ 


32 


(2\ 


64 


1.64 X 10-"^ 


1.26 X 10-"^ 


9.66 X 10-"^ 


1.48 X 10-"® 


32 


rrn^^^ 


128 


5.71 X 10-"9 


9.90 X 10-"^ 


5.50 X 10-"® 


1.68 X 10-"^ 


32 


rrn^^^ 


256 


4.02 X 10-"^ 


2.85 X 10-"^ 


2.74 X 10-"® 


4.48 X 10-"® 



Table 10.11: Relative residual norms: ill conditioned linear systems via nmb approximation and 
block triangulation 



n 


r 


min 


mzix 


mean 


std 


32 


1 


1.49 X 10-^=* 


1.36 X 10-'' 


4.25 X 10-" 


1.56 X 10-1" 


32 


2 


3.70 X 10-1^ 


2.13 X 10-® 


3.83 X 10-"' 


2.35 X 10-y 


32 


4 


9.33 X 10-1^ 


1.08 X 10-® 


3.37 X 10-"' 


1.26 X lO-'-* 


64 


1 


1.11 X 10-^^ 


6.87 X 10-y 


2.03 X 10-1° 


7.49 X 10-1" 


64 


2 


1.53 X 10-1^ 


1.21 X 10-® 


5.86 X 10-"' 


1.77 X 10-y 


64 


4 


2.21 X 10-1^ 


1.27 X 10-^ 


1.69 X 10-y 


1.28 X 10-® 
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Table 10.12: Relative residual norms: ill conditioned linear systems with MLDIVIDE(A,B) 



n 


r 


min 


max 


mean 


std 


32 


1 


6.34 X 10~^ 


7.44 X IQi 


1.74 X 10" 


7.53 X 10" 


32 


2 


2.03 X 10"^ 


1.32 X 10^ 


9.19 X 10"^ 


1.62 X 10" 


32 


4 


4.57 X 10"^ 


1.36 X IQi 


1.14 X 10" 


1.93 X 10" 


64 


1 


3.82 X 10--^ 


9.93 X 10" 


1.03 X 10" 


1.66 X 10" 


64 


2 


1.96 X IQ-^ 


1.27 X 10^ 


3.09 X 10" 


1.40 X 10^ 


64 


4 


7.13 X 10"^ 


6.63 X 10" 


8.23 X 10"^ 


1.20 X 10" 



Tabic 10.13: Relative residual norms: ill conditioned linear systems with dual additive preprocessing 
and block triangulation 



n 


q 


min 


max 


mean 


std 


32 


1 


2.33 X 10-" 


2.28 X 10-"** 


2.31 X 10-"" 


2.28 X 10""^ 


32 


2 


3.40 X lO"^'' 


4.93 X 10""** 


9.11 X 10"^" 


5.71 X 10""^ 


32 


4 


5.97 X 10-1^ 


1.63 X 10""^ 


2.22 X lO""'' 


1.64 X 10-"« 


64 


1 


3.90 X 10-" 


2.78 X 10""^ 


2.81 X 10-"^ 


2.78 X 10""'^ 


64 


2 


3.53 X 10-^^ 


3.76 X 10-"« 


1.13 X 10-"^ 


4.72 X 10""'^ 


64 


4 


3.54 X 10-1^ 


2.53 X 10""^ 


5.19 X lO""'^ 


2.83 X 10""** 



Table 10.14: Relative residual norms: ill conditioned linear system with random multipliers and 
block triangulation 



n 


q 


min 




mean 


std 


32 


1 


7.08 X 10"^" 


4.00 X 10"^^ 


4.52 X 10-^^ 


4.01 X 10"^* 


32 


2 


7.49 X 10"^" 


2.29 X 10"^' 


2.77 X 10"^^ 


2.33 X 10"^^ 


32 


4 


1.46 X 10"^** 


1.63 X 10""^ 


4.83 X 10"^^ 


2.73 X 10"^* 


64 


1 


1.13 X 10-^y 


1.01 X 10-^4 


2.31 X 10-^** 


1.11 X 10"^'^ 


64 


2 


6.60 X lO"^'* 


6.90 X 10-^4 


1.45 X 10"^'^ 


7.73 X lO-^'^ 


64 


4 


2.60 X 10"^** 


1.41 X 10-^1 


1.61 X 10"^^ 


1.42 X 10"^^ 



Table 10.15: Relative residual norms: ill conditioned linear systems with the SMW formula 



n 


r 


min 


max 


mean 


std 


64 


1 


1.18 X 10-^^ 


6.30 X 10-1^ 


2.37 X 10-" 


7.45 X 10-" 


64 


2 


3.42 X 10"^^ 


1.94 X 10-1" 


2.15 X 10-1^ 


1.94 X 10-11 


64 


4 


6.66 X 10-1^ 


1.25 X 10-1" 


1.82 X 10-1^ 


1.25 X 10-11 


128 


1 


5.79 X 10"^^ 


4.85 X 10-1^ 


1.21 X 10-1^ 


4.96 X 10-1^ 


128 


2 


1.45 X 10"" 


1.85 X 10-11 


5.23 X 10-1^ 


1.88 X 10-1^ 


128 


4 


8.41 X 10"" 


4.75 X 10-11 


2.89 X 10-1^ 


5.95 X 10-1^ 
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Table 10.16: The CPU time (in cycles) for solving an ill conditioned real symmetric Toeplitz linear 

system 



n 


Alg. 6.1 


QR 


SVD 


QR/Alg. 6.1 


SVD/Alg. 6.1 


512 


56.3 


148.4 


4134.8 


2.6 


73.5 


1024 


120.6 


1533.5 


70293.1 


12.7 


582.7 


2048 


265.0 


11728.1 




44.3 




4096 


589.4 










8192 


1304.8 
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Appendix 



A Operations with structured matrices in terms of their dis- 
placements 

The following simple theorem can be found in [POO] or [POT| Sect ion 1.51. 

Theorem A.l. Assume five matrices A, B , C , M and N and a pair of scalars a and 13. Then as 
long as the matrix sizes are compatible we have 

A{aM + dN) ~ {aM + PN)B = a{AM - MB) + I3{AN - NB), (A.l) 

A^M'^ - B^M'^ = -{BM - MAf, (A.2) 
A{MN) - {MN)C = {AM - MB)N + M{BN - NC). (A.3) 
Furthermore for a nonsingular matrix M we have 

AM-^ - M-^B = -M-\BM - MA)M-\ (A.4) 
Corollary A.l. Under the assumptions of Theorem \A.l\ we have 

GA.siaM + PN) = {aGA.BiM) \ PGa.b{N)), (A.5) 
HaMc^M + I3N) = (aHAMM) \ PHaA^)), (A.6) 
GaAM^) = ~Hbt^at{M^), HaMM^) = Gbt^a^{M^), (A.7) 
Ga,c{MN) = (Ga,b(M) I MGb,c{N)), (A.8) 
HaMMN) = {N^HaAM) I Hb,c{N)), (A.9) 

GaMM'^) - -M-^Gb,a{M), Ha,b{M-^) = M-^HbAM) (A.IO) 

and consequently 

dA.BiaM + /3A) < dA^B{M) + dAMN), (A.ll) 

dA^BiM^)^dBT^ATiM), (A.12) 

dA,c{MN) < dA,B{M) + dB,c{N), (A.13) 

dA,B(M-i) =dB,A(M). (A.14) 

B Uniform random sampUng and nonsingularity of random 
matrices 

Uniform random sampling of elements from a finite set A is their selection from this set at random, 
independently of each other and under the uniform probability distribution on the set A. 

Theorem B.l. Under the assumptions of Lemma \3.1\ let the values of the variables of the polynomial 
be randomly and uniformly sampled from a finite set A. Then the polynomial vanishes with a 
probability at most 

Corollary B.l. Let the entries of a general or Toeplitz m x n matrix have been randomly and 
uniformly sampled from a finite set A of cardinality |A| (in any fixed ring). Let I = min{m,n}. 
Then (a) every k x k submatrix M for k < I is nonsingular with a probability at least 1 — and 

(b) is strongly nonsingular with a probability at least 1 — \k\ ^ ^ ~ '^Tfsj^' Furthermore (c) if 

the submatrix M is indeed nonsingular, then any entry of its inverse is nonzero with a probability 
at least 1 — 4xf . 



52 



Proof. The claimed properties of nonsingularity and nonvanishing hold for generic matrices. The 
singularity of a fc x fc matrix means that its determinant vanishes, but the determinant is a polynomial 
of total degree k in the entries. Therefore Theorem IB . 1 1 implies parts (a) and consequently (b). Part 
(c) follows because a fixed entry of the inverse vanishes if and only if the respective entry of the 
adjoint vanishes, but up to the sign the latter entry is the determinant of a (fc— 1) x (fc— 1) submatrix 
of the input matrix M, and so it is a polynomial of degree A; — 1 in its entries. □ 



C Extremal singular values of random complex matrices 

We have assumed dealing with real random matrices and vectors throughout the paper, but most 
of our study can be readily extended to the computations in the field C of complex numbers if we 
replace the transposes by the Hermitian transposes . All the results of Section 13.21 apply to 
this case equally well. Below we elaborate upon the respective extension of our probabilistic bounds 
on the norms and singular values. 

Definition C.l. The set G^^^ of complex Gaussian random m x n matrices with a mean ^ and a 
variance a is the set {A + B^/^} for {A \ B) £ GJl]^'^"' (cf. Definition\33^. 



We can immediately extend Theorem 13.21 to the latter matrices. Let us extend Theorem 13. II Its 
original proof in |SST06j relies on the following result. 

Lemma C.l. Suppose y is a positive number; w e M."'^^ is any fixed real unit vector, ||w|| = 1, 
A e GJ^^" and therefore is nonsingular with probability 1. Then 

Probability{||A-iw|| > 1/y} < \ -- for j = l,...,n. 

V TT (7 

The following lemma and corollary extend Lemmas 13.21 and IC.ll to the complex case. 



Lemma C.2. The bound of Lemma \3.S] also holds provided t = q + r\/—l is a fixed complex unit 
vector and b = f + G //a '^^ ^ complex vector such that f , g, q and r are real vectors, 

nx 1 



|t|| = 1, and the vectors f and g are in GJ^^ 



Proof We have t^b = q'^f + r^g + (q'^g - r^f and so \t"h\^ = |q^f + r^gp + |q^g-r^f|2 



Hence |t^b| > |q^f + r^g| = |u^v| where = (q^ | r^) and v = (F | g^)^. Note that v e Gh'^J"' 



Corollary C.l. Suppose y is a positive number and suppose a matrix A £ Gc^J^u o*^*^ therefore is 



and ||u|| = ||t|| = 1 and apply Lemma [3.21 to real vectors u and v replacing b and t. □ 

Corollary C.l. Suppose y is a posit 
nonsingular with probability 1. Then 

Probability{||A"^ej|| > 1/y} < ^/-^^ for j = 1, . . . ,n. 



TT a 



Proof. In the case of real matrices A the corollary is supported by the argument in the proof of 
[SST06I Lemma 3.2], which employs the well known estimate that we state as our Lemma [5?^ Now 
we employ Lemma IC.2I instead of this estimate, otherwise keep the same argument as in [SST06) . 
and arrive at Corollarv lC.il □ 

Corollary C.2. Under the assumptions of Corollary \C.l\ we have ||j4~^|| < -^j 'where Xj are 

nonnegative random variables such that 

Probability{Xj > 1/y} < \ -- for j ^l,...,n. 

V TT (7 

Proof. Recall that for any nxn matrix B we have ||_B|| = ||5w|| for some unit vector w = 
■'^j^i- We have \wj\ < ||w|| = 1 for all j. Substitute B ~ A^^ and obtain — ||A^-'-w|| = 

IIE"=i^"'w^jejll < E"=i kjl ll^^'ejll, and so \\A-^\ \ < E"=i where X, - \\A-^ej\\ for aU j. 
It remains to combine this bound with Corollarv lC.il □ 
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The corollary implies that Probability! | 1 1 > l/y} converges to proportionally to ?/ as y — 0, 
which can be viewed as an extension of Theorem 13. II to the case of complex inputs. One can deduce 
similar extensions of Theorems I3.4H3.6I The resulting estimates are a little weaker than in Section 
13.31 being overly pessimistic; actually random complex matrices are a little better conditioned than 
random real matrices (see [E88] . |ES05| . jCD05) and our Table [lO^). 
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